60
The effectiveness of eight nonpharmaceutical interventions against COVID-19 in 41 countries Jan M. Brauner MD @,*, 1,2 , Sören Mindermann *, 1 , Mrinank Sharma *, 3 , Anna B. Stephenson 4 , Tomáš Gavenˇ ciak PhD 5 , David Johnston 6,7 , Gavin Leech 8 , John Salvatier 7 , George Altman MBChB 9 , Alexander John Norman 5 , Joshua Teperowski Monrad 2 , Tamay Besiroglu 10 , Hong Ge PhD 11 , Vladimir Mikulik 5 , Meghan A. Hartwick PhD 12 , Prof Yee Whye Teh PhD 13 , Prof Leonid Chindelevitch PhD 14 , Prof Yarin Gal PhD +, 1 , Jan Kulveit PhD +, 2 1 OATML, Department of Computer Science, University of Oxford, Oxford, United Kingdom. 2 Future of Humanity Institute, University of Oxford, Oxford, United Kingdom. 3 Department of Computer Science, University of Oxford, Oxford, United Kingdom. 4 Harvard John A. Paulson School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, USA. 5 No affiliation. 6 College of Engineering and Computer Science, Australian National University, Australia. 7 Quantified Uncertainty Research Institute, California, USA. 8 School of Computer Science, University of Bristol, Bristol, United Kingdom. 9 School of Medical Sciences, University of Manchester, Manchester, United Kingdom. 10 Faculty of Economics, University of Cambridge, Cambridge, United Kingdom. 11 Engineering Department, University of Cambridge, Cambridge, United Kingdom. 12 Tufts Initiative for the Forecasting and Modeling of Infectious Diseases, Tufts University, Boston, USA. 13 Department of Statistics, University of Oxford, Oxford, United Kingdom. 14 Computational Epidemiology Lab, School of Computing Science, Simon Fraser University, Burnaby, Canada. Abstract Background. Governments are attempting to control the COVID-19 pandemic with non- pharmaceutical interventions (NPIs). However, it is still largely unknown how effective different NPIs are at reducing transmission. Data-driven studies can estimate the effective- ness of NPIs while minimizing assumptions, but existing analyses lack sufficient data and validation to robustly distinguish the effects of individual NPIs. Methods. We collect chronological data on NPIs in 41 countries between January and May 2020, using independent double entry by researchers to ensure high data quality. We esti- mate NPI effectiveness with a Bayesian hierarchical model, by linking NPI implementation dates to national case and death counts. To our knowledge, this is the largest and most thoroughly validated data-driven study of NPI effectiveness to date. @ Correspondence to [email protected] * Equal contribution + Contributed equally to senior authorship This work was conducted in association with the EpidemicForecasting.org project . CC-BY-NC 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted July 23, 2020. ; https://doi.org/10.1101/2020.05.28.20116129 doi: medRxiv preprint NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

The effectiveness of eight nonpharmaceutical interventions againstCOVID-19 in 41 countries

Jan M Brauner MDlowast 12 Soumlren Mindermannlowast1 Mrinank Sharmalowast3 Anna BStephenson4 Tomaacuteš Gavenciak PhD5 David Johnston67 Gavin Leech8 John Salvatier7George Altman MBChB9 Alexander John Norman5 Joshua Teperowski Monrad2 TamayBesiroglu10 Hong Ge PhD11 Vladimir Mikulik5 Meghan A Hartwick PhD12 Prof YeeWhye Teh PhD13 Prof Leonid Chindelevitch PhD14 Prof Yarin Gal PhD+ 1 Jan KulveitPhD+2

1OATML Department of Computer Science University of Oxford Oxford United Kingdom2Future of Humanity Institute University of Oxford Oxford United Kingdom3Department of Computer Science University of Oxford Oxford United Kingdom4Harvard John A Paulson School of Engineering and Applied Sciences Harvard University CambridgeMassachusetts USA5No affiliation6College of Engineering and Computer Science Australian National University Australia7Quantified Uncertainty Research Institute California USA8School of Computer Science University of Bristol Bristol United Kingdom9School of Medical Sciences University of Manchester Manchester United Kingdom10Faculty of Economics University of Cambridge Cambridge United Kingdom11Engineering Department University of Cambridge Cambridge United Kingdom12Tufts Initiative for the Forecasting and Modeling of Infectious Diseases Tufts University Boston USA13Department of Statistics University of Oxford Oxford United Kingdom14Computational Epidemiology Lab School of Computing Science Simon Fraser University Burnaby Canada

Abstract

Background Governments are attempting to control the COVID-19 pandemic with non-pharmaceutical interventions (NPIs) However it is still largely unknown how effectivedifferent NPIs are at reducing transmission Data-driven studies can estimate the effective-ness of NPIs while minimizing assumptions but existing analyses lack sufficient data andvalidation to robustly distinguish the effects of individual NPIs

Methods We collect chronological data on NPIs in 41 countries between January and May2020 using independent double entry by researchers to ensure high data quality We esti-mate NPI effectiveness with a Bayesian hierarchical model by linking NPI implementationdates to national case and death counts To our knowledge this is the largest and mostthoroughly validated data-driven study of NPI effectiveness to date

Correspondence to janbraunerengoxacuklowastEqual contribution+Contributed equally to senior authorshipThis work was conducted in association with the EpidemicForecastingorg project

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NOTE This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice

Results We model each NPIrsquos effect as a multiplicative (percentage) reduction in the repro-duction number R We estimate the mean reduction in R across the countries in our data foreight NPIs mandating mask-wearing in (some) public spaces (2 95 CI -14ndash16)limiting gatherings to 1000 people or less (2 -20ndash22) to 100 people or less (211ndash39) to 10 people or less (36 16ndash53) closing some high-risk businesses (3113ndash46) closing most nonessential businesses (40 22ndash55) closing schools anduniversities (39 21ndash55) and issuing stay-at-home orders (18 4ndash31) Theseresults are supported by extensive empirical validation including 15 sensitivity analyses

Conclusions Our results suggest that by implementing effective NPIs many countries canreduce R below 1 without issuing a stay-at-home order We find a surprisingly large rolefor school and university closures in reducing COVID-19 transmission a contribution to theongoing debate about the relevance of asymptomatic carriers in disease spread Banninggatherings and closing high-risk businesses can be highly effective in reducing transmissionbut closing most businesses only has limited additional benefit

2

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Introduction

Worldwide governments have mobilised vast resources to fight the COVID-19 pandemic Awide range of nonpharmaceutical interventions (NPIs) has been deployed including drasticmeasures like stay-at-home orders and the closure of all nonessential businesses Recentanalyses show that these large-scale NPIs are jointly effective at reducing the virusrsquo effectivereproduction number1 but the effects of individual NPIs are still largely unknown As timeprogresses and more data become available we can move beyond estimating the combinedeffect of a bundle of NPIs and begin to understand the effects of individual interventionsThis can help governments efficiently control the epidemic while removing less effectiveNPIs to ease the burden put on the population

A promising way to estimate NPI effectiveness is data-driven cross-country modelling in-ferring effectiveness by relating the NPIs implemented in different countries to the courseof the epidemic in these countries To disentangle the effects of individual NPIs we needto leverage data from multiple regions with diverse sets of interventions in place Withsome exceptions1ndash4 previous data-driven studies focus on single NPIs or single geographi-cal regions (Table F4) In contrast we evaluate the impact of eight NPIs on the epidemicrsquosgrowth in 34 European and 7 non-European countries To our knowledge this is the largestdata-driven study of NPI effects on COVID-19 transmission to date The data gathered ispublicly available

To isolate the effect of individual NPIs we also require sufficiently diverse data If all coun-tries implemented the same set of NPIs on the same day the individual effect of each NPIwould be unidentifiable However the COVID-19 response was far less coordinated coun-tries implemented different sets of NPIs at different times in different orders (Figure 1)

Even with diverse data from many countries estimating NPI effects remains a challengingtask First many components of a model such as epidemiological parameters and interac-tions between NPIs are only known with high uncertainty Two recent replication studiesdemonstrated that NPI effectiveness estimates can be highly sensitive to arbitrary modellingdecisions5 especially when based on insufficient data6 Second the data are retrospectiveand observational meaning that unobserved factors could confound the results Thirdlarge-scale public NPI datasets suffer from frequent inconsistencies7 and missing data8 Forthese reasons the data and the model should be carefully validated Insufficiently validatedresults should not be used to guide policy decisions We perform to our knowledge by farthe most extensive validation of any COVID-19 NPI effectiveness results to date a crucialbut largely absent or incomplete element of NPI effectiveness studies5

3

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Universities closedSchools closedMask-wearing

Most businesses closed Stay-at-home orderSome businesses closed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the window ofanalysis starts on the 22nd of January and ends after the first NPI was lifted or on the 30thof May 2020 whichever was earlier The reason to end the analysis after the first majorreopeningb was to avoid a distribution shift For example when schools reopened it wasoften with safety measures such as smaller class sizes and distancing rules It is thereforeexpected that contact patterns in schools will have been different before school closurecompared to after reopening Modelling this difference explicitly is left for future workData on confirmed COVID-19 cases and deaths were taken from the Johns Hopkins CSSECOVID-19 Dataset9 The data used in this study including sources are available onlinehere

aThe countries were selected for their number of cumulative cases surpassing a minimum threshold (at thetime of modelling) the availability of reliable data on NPIs and the trustworthiness of their reporting of deathsFinally we excluded very large countries like China the US and Canada for ease of data collection as thesewould require more locally fine-grained data

bConcretely the window of analysis extended until three days after the first reopening for confirmed casesand 13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-homeorder was often accompanied by other NPIs such as businesses closuresHowever a stay-at-home order does not in principle entail these otherNPIs but only the (additional) order to generally stay at home exceptfor exemptions

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 2: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Results We model each NPIrsquos effect as a multiplicative (percentage) reduction in the repro-duction number R We estimate the mean reduction in R across the countries in our data foreight NPIs mandating mask-wearing in (some) public spaces (2 95 CI -14ndash16)limiting gatherings to 1000 people or less (2 -20ndash22) to 100 people or less (211ndash39) to 10 people or less (36 16ndash53) closing some high-risk businesses (3113ndash46) closing most nonessential businesses (40 22ndash55) closing schools anduniversities (39 21ndash55) and issuing stay-at-home orders (18 4ndash31) Theseresults are supported by extensive empirical validation including 15 sensitivity analyses

Conclusions Our results suggest that by implementing effective NPIs many countries canreduce R below 1 without issuing a stay-at-home order We find a surprisingly large rolefor school and university closures in reducing COVID-19 transmission a contribution to theongoing debate about the relevance of asymptomatic carriers in disease spread Banninggatherings and closing high-risk businesses can be highly effective in reducing transmissionbut closing most businesses only has limited additional benefit

2

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Introduction

Worldwide governments have mobilised vast resources to fight the COVID-19 pandemic Awide range of nonpharmaceutical interventions (NPIs) has been deployed including drasticmeasures like stay-at-home orders and the closure of all nonessential businesses Recentanalyses show that these large-scale NPIs are jointly effective at reducing the virusrsquo effectivereproduction number1 but the effects of individual NPIs are still largely unknown As timeprogresses and more data become available we can move beyond estimating the combinedeffect of a bundle of NPIs and begin to understand the effects of individual interventionsThis can help governments efficiently control the epidemic while removing less effectiveNPIs to ease the burden put on the population

A promising way to estimate NPI effectiveness is data-driven cross-country modelling in-ferring effectiveness by relating the NPIs implemented in different countries to the courseof the epidemic in these countries To disentangle the effects of individual NPIs we needto leverage data from multiple regions with diverse sets of interventions in place Withsome exceptions1ndash4 previous data-driven studies focus on single NPIs or single geographi-cal regions (Table F4) In contrast we evaluate the impact of eight NPIs on the epidemicrsquosgrowth in 34 European and 7 non-European countries To our knowledge this is the largestdata-driven study of NPI effects on COVID-19 transmission to date The data gathered ispublicly available

To isolate the effect of individual NPIs we also require sufficiently diverse data If all coun-tries implemented the same set of NPIs on the same day the individual effect of each NPIwould be unidentifiable However the COVID-19 response was far less coordinated coun-tries implemented different sets of NPIs at different times in different orders (Figure 1)

Even with diverse data from many countries estimating NPI effects remains a challengingtask First many components of a model such as epidemiological parameters and interac-tions between NPIs are only known with high uncertainty Two recent replication studiesdemonstrated that NPI effectiveness estimates can be highly sensitive to arbitrary modellingdecisions5 especially when based on insufficient data6 Second the data are retrospectiveand observational meaning that unobserved factors could confound the results Thirdlarge-scale public NPI datasets suffer from frequent inconsistencies7 and missing data8 Forthese reasons the data and the model should be carefully validated Insufficiently validatedresults should not be used to guide policy decisions We perform to our knowledge by farthe most extensive validation of any COVID-19 NPI effectiveness results to date a crucialbut largely absent or incomplete element of NPI effectiveness studies5

3

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Universities closedSchools closedMask-wearing

Most businesses closed Stay-at-home orderSome businesses closed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the window ofanalysis starts on the 22nd of January and ends after the first NPI was lifted or on the 30thof May 2020 whichever was earlier The reason to end the analysis after the first majorreopeningb was to avoid a distribution shift For example when schools reopened it wasoften with safety measures such as smaller class sizes and distancing rules It is thereforeexpected that contact patterns in schools will have been different before school closurecompared to after reopening Modelling this difference explicitly is left for future workData on confirmed COVID-19 cases and deaths were taken from the Johns Hopkins CSSECOVID-19 Dataset9 The data used in this study including sources are available onlinehere

aThe countries were selected for their number of cumulative cases surpassing a minimum threshold (at thetime of modelling) the availability of reliable data on NPIs and the trustworthiness of their reporting of deathsFinally we excluded very large countries like China the US and Canada for ease of data collection as thesewould require more locally fine-grained data

bConcretely the window of analysis extended until three days after the first reopening for confirmed casesand 13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-homeorder was often accompanied by other NPIs such as businesses closuresHowever a stay-at-home order does not in principle entail these otherNPIs but only the (additional) order to generally stay at home exceptfor exemptions

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 3: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Introduction

Worldwide governments have mobilised vast resources to fight the COVID-19 pandemic Awide range of nonpharmaceutical interventions (NPIs) has been deployed including drasticmeasures like stay-at-home orders and the closure of all nonessential businesses Recentanalyses show that these large-scale NPIs are jointly effective at reducing the virusrsquo effectivereproduction number1 but the effects of individual NPIs are still largely unknown As timeprogresses and more data become available we can move beyond estimating the combinedeffect of a bundle of NPIs and begin to understand the effects of individual interventionsThis can help governments efficiently control the epidemic while removing less effectiveNPIs to ease the burden put on the population

A promising way to estimate NPI effectiveness is data-driven cross-country modelling in-ferring effectiveness by relating the NPIs implemented in different countries to the courseof the epidemic in these countries To disentangle the effects of individual NPIs we needto leverage data from multiple regions with diverse sets of interventions in place Withsome exceptions1ndash4 previous data-driven studies focus on single NPIs or single geographi-cal regions (Table F4) In contrast we evaluate the impact of eight NPIs on the epidemicrsquosgrowth in 34 European and 7 non-European countries To our knowledge this is the largestdata-driven study of NPI effects on COVID-19 transmission to date The data gathered ispublicly available

To isolate the effect of individual NPIs we also require sufficiently diverse data If all coun-tries implemented the same set of NPIs on the same day the individual effect of each NPIwould be unidentifiable However the COVID-19 response was far less coordinated coun-tries implemented different sets of NPIs at different times in different orders (Figure 1)

Even with diverse data from many countries estimating NPI effects remains a challengingtask First many components of a model such as epidemiological parameters and interac-tions between NPIs are only known with high uncertainty Two recent replication studiesdemonstrated that NPI effectiveness estimates can be highly sensitive to arbitrary modellingdecisions5 especially when based on insufficient data6 Second the data are retrospectiveand observational meaning that unobserved factors could confound the results Thirdlarge-scale public NPI datasets suffer from frequent inconsistencies7 and missing data8 Forthese reasons the data and the model should be carefully validated Insufficiently validatedresults should not be used to guide policy decisions We perform to our knowledge by farthe most extensive validation of any COVID-19 NPI effectiveness results to date a crucialbut largely absent or incomplete element of NPI effectiveness studies5

3

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Universities closedSchools closedMask-wearing

Most businesses closed Stay-at-home orderSome businesses closed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the window ofanalysis starts on the 22nd of January and ends after the first NPI was lifted or on the 30thof May 2020 whichever was earlier The reason to end the analysis after the first majorreopeningb was to avoid a distribution shift For example when schools reopened it wasoften with safety measures such as smaller class sizes and distancing rules It is thereforeexpected that contact patterns in schools will have been different before school closurecompared to after reopening Modelling this difference explicitly is left for future workData on confirmed COVID-19 cases and deaths were taken from the Johns Hopkins CSSECOVID-19 Dataset9 The data used in this study including sources are available onlinehere

aThe countries were selected for their number of cumulative cases surpassing a minimum threshold (at thetime of modelling) the availability of reliable data on NPIs and the trustworthiness of their reporting of deathsFinally we excluded very large countries like China the US and Canada for ease of data collection as thesewould require more locally fine-grained data

bConcretely the window of analysis extended until three days after the first reopening for confirmed casesand 13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-homeorder was often accompanied by other NPIs such as businesses closuresHowever a stay-at-home order does not in principle entail these otherNPIs but only the (additional) order to generally stay at home exceptfor exemptions

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 4: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Universities closedSchools closedMask-wearing

Most businesses closed Stay-at-home orderSome businesses closed

Figure 1 Timing of NPI implementations in early 2020 Crossed-out symbols signify when an NPI waslifted Detailed definitions of the NPIs are given in Table 1

4

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the window ofanalysis starts on the 22nd of January and ends after the first NPI was lifted or on the 30thof May 2020 whichever was earlier The reason to end the analysis after the first majorreopeningb was to avoid a distribution shift For example when schools reopened it wasoften with safety measures such as smaller class sizes and distancing rules It is thereforeexpected that contact patterns in schools will have been different before school closurecompared to after reopening Modelling this difference explicitly is left for future workData on confirmed COVID-19 cases and deaths were taken from the Johns Hopkins CSSECOVID-19 Dataset9 The data used in this study including sources are available onlinehere

aThe countries were selected for their number of cumulative cases surpassing a minimum threshold (at thetime of modelling) the availability of reliable data on NPIs and the trustworthiness of their reporting of deathsFinally we excluded very large countries like China the US and Canada for ease of data collection as thesewould require more locally fine-grained data

bConcretely the window of analysis extended until three days after the first reopening for confirmed casesand 13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-homeorder was often accompanied by other NPIs such as businesses closuresHowever a stay-at-home order does not in principle entail these otherNPIs but only the (additional) order to generally stay at home exceptfor exemptions

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 5: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Methods

Dataset

We analyse the effects of NPIs (Table 1) in 41 countriesa (see Figure 1) We recorded NPIimplementations when the measures were implemented nationally or in most regions of acountry (affecting at least three fourths of the population) For each country the window ofanalysis starts on the 22nd of January and ends after the first NPI was lifted or on the 30thof May 2020 whichever was earlier The reason to end the analysis after the first majorreopeningb was to avoid a distribution shift For example when schools reopened it wasoften with safety measures such as smaller class sizes and distancing rules It is thereforeexpected that contact patterns in schools will have been different before school closurecompared to after reopening Modelling this difference explicitly is left for future workData on confirmed COVID-19 cases and deaths were taken from the Johns Hopkins CSSECOVID-19 Dataset9 The data used in this study including sources are available onlinehere

aThe countries were selected for their number of cumulative cases surpassing a minimum threshold (at thetime of modelling) the availability of reliable data on NPIs and the trustworthiness of their reporting of deathsFinally we excluded very large countries like China the US and Canada for ease of data collection as thesewould require more locally fine-grained data

bConcretely the window of analysis extended until three days after the first reopening for confirmed casesand 13 days after the first reopening for deaths These values correspond to the 5 quantile of the infection-to-confirmationdeath distributions ensuring that less than 5 of the new infections on the reopening day werestill observed in the window of analysis

5

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-homeorder was often accompanied by other NPIs such as businesses closuresHowever a stay-at-home order does not in principle entail these otherNPIs but only the (additional) order to generally stay at home exceptfor exemptions

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 6: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Table 1 NPIs included in the study Appendix G details how edge cases in the data collection werehandled

NPI DescriptionMask-wearingmandatory in(some) publicspaces

A country has mandated mask usage in the public sometimes limitedto just some public spaces (which the government deems to have a highrisk of infection) For example some countries mandated mask-wearingin most or all indoor public spaces but not outdoors

Gatheringslimited to 1000people or less

A country has set a size limit on gatherings The limit is at most 1000people (often less) and gatherings above the maximum size are disal-lowed For example a ban on gatherings of 500 people or more wouldbe classified as ldquogatherings limited to 1000 or lessrdquo but a ban on gath-erings of 2000 people or more would not

Gatheringslimited to 100people or less

A country has set a size limit on gatherings The limit is at most 100people (often less)

Gatheringslimited to 10people or less

A country has set a size limit on gatherings The limit is at most 10people (often less)

Some businessesclosed

A country has specified a few kinds of customer-facing businesses thatare considered ldquohigh riskrdquo and need to suspend operations (black-list) Common examples are restaurants bars nightclubs cinemasand gyms By default businesses are not suspended

Most nonessentialbusinesses closed

A country has suspended the operations of many customer-facing busi-nesses By default customer-facing businesses are suspended unlessthey are designated as essential (whitelist)

Schools closed A country has closed most or all schoolsUniversitiesclosed

A country has closed most or all universities and higher education fa-cilities

Stay-at-homeorder (withexemptions)

An order for the general public to stay at home has been issued This ismandatory not just a recommendation Exemptions are usually grantedfor certain purposes (such as shopping exercise or going to work) ormore rarely for certain times of the day In practice a stay-at-homeorder was often accompanied by other NPIs such as businesses closuresHowever a stay-at-home order does not in principle entail these otherNPIs but only the (additional) order to generally stay at home exceptfor exemptions

6

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 7: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Data collection

We collected data on the start and end date of NPI implementations from the start of thepandemic until the 30th of May 2020 Before collecting the data we experimented withseveral public NPI datasets finding that they were not complete enough for our modellingand contained incorrect datesc By focusing on a smaller set of countries and NPIs than thesedatasets we were able to enforce strong quality controls We used independent doubleentry and manually compared our data to public datasets for cross-checking

First two authors independently researched each country and entered the NPI data into sep-arate spreadsheets The researchers manually researched the dates using internet searchesthere was no automatic component in the data gathering process The average time spentresearching each country per researcher was 15 hours

Second the researchers independently compared their entries to the following public datasetsand if there were conflicts visited all primary sources to resolve the conflict the EFGNPIdatabase10 the Oxford COVID-19 Government Response Tracker11 and the mask4all dataset12

Third each country and NPI was again independently entered by one to three paid con-tractors who were provided with a detailed description of the NPIs and asked to includeprimary sources with their data A researcher then resolved any conflicts between this dataand one (but not both) of the spreadsheets

Finally the two independent spreadsheets were combined and all conflicts resolved by aresearcher The final dataset contains primary sources (government websites andor mediaarticles) for each entry

Data Preprocessing

Data on cases and deaths are noisy Many countries preferentially report deaths and cases oncertain days of the week For example there are days with zero newly confirmed cases eventhough there had been several hundred reported cases on the previous day We thereforesmooth the data using a moving average (on a linear scale) over 2 days into the past andfuture When the case count is small a large fraction of cases may be imported from other

cWe evaluated the following datasetsbull Epidemic Forecasting Global NPI Database10

bull Oxford COVID-19 Government Response Tracker (OxCGRT)11

bull ACAPS COVID19 Government Measures DatasetNote that these datasets are under continuous development Many of the mistakes found will already have beencorrected We know from our own experience that data collection can be very challenging We have the fullestrespect for the people behind these datasets In this paper we focus on a more limited set of countries and NPIsthan these datasets contain allowing us to ensure higher data quality in this subset Given our experience withpublic datasets and our data collection we encourage fellow COVID-19 researchers to independently verify thequality of public data they use if feasible

7

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 8: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

countries and the testing regime may change rapidly To prevent this from biasing ourmodel we neglect case numbers before a country has reached 100 confirmed cases anddeath numbers before a country has reached 10 deaths We include all preprocessing stepsin our sensitivity analysis (Appendix C2)

Model

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure 2 Model Overview Purple nodes are observed or have a fixed distribution From bottom to topThe effectiveness of intervention i is represented by αi On each day t a countryrsquos daily reproductionnumber Rt c depends on the countryrsquos basic reproduction number R0c and the active NPIs The activeNPIs are encoded by Φi t c which is 1 if NPI i is active in country c at time t and 0 otherwise Rt c istransformed into the daily growth rate gt c which also depends on the generation interval The growthrate is used to compute the new infections N (C )

t c and N (D)t c that will later be registered as confirmed cases

Ct c and deaths Dt c respectively after a delay Our model uses both death and case data it splits allnodes above the daily growth rate gt c into separate branches for deaths and confirmed cases

Our model uses case and death data from each country to lsquobackwardsrsquo infer the number ofnew infections at each point in time which is itself used to infer the reproduction numbersNPI effects are then estimated by relating the daily reproduction numbers to the activeNPIs across all days and countries This relatively simple lsquodata-drivenrsquo approach allows usto sidestep assumptions about contact patterns and intensity infectiousness of different agegroups and so forth that are typically required in modelling studies Our semi-mechanisticBayesian hierarchical model is based on that of Flaxman et al1 extended to use both caseand death data This increases the amount of data from which we can extract NPI effectsreduces distinct biases of case and death reporting and reduces the bias of only including

8

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 9: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

countries that have many deaths Additionally as we do not aim to infer the total number ofCOVID-19 infections we do not assume a specific infection fatality rate (IFR) or ascertain-ment rate (rate of testing) We proceed by summarising the model (Figure 2) A detaileddescription is given in Appendix A Code is available online here

The growth of the epidemic is determined by the time- and country-specific reproductionnumber Rt c which depends on a) the (unobserved) basic reproduction number R0c givenno active NPIs and b) the active NPIs at time t R0c accounts for all time-invariant factorsthat affect transmission in country c such as differences in demographics population den-sity culture and health systems13 We assume that the effect of each NPI on Rt c is stableacross countries and time The effectiveness of NPI i is represented by a parameter αi over which we place a symmetric prior with mean zero allowing both positive and negativeeffects Following Flaxman et al and others1ndash3 each NPIrsquos effect on Rt c is assumed toindependently affect Rt c as a multiplicative factor

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (1)

where φi ct = 1 indicates that NPI i is active in country c on day t (φi ct = 0 otherwise) andI is the number of NPIs The multiplicative effect encodes the plausible assumption thatNPIs have a smaller absolute effect when Rt c is already low We discuss the meaning ofeffectiveness estimates given NPI interactions in the Results section

In the early phase of an epidemic the number of new daily infections grows exponentiallyDuring exponential growth there is a one-to-one correspondence between the daily growthrate and Rt c 14 The correspondence depends on the generation interval (the time betweensuccessive infections in a chain of transmission) which we assume to have a Gamma dis-tribution with mean 667 days11516 We model the daily new infection count separatelyfor confirmed cases and deaths representing those infections which are later reported andthose which are later fatal However both infection numbers are assumed to grow at thesame daily rate in expectation allowing the use of both data sources to estimate each αi The infection numbers translate into reported confirmed cases and deaths after a stochasticdelay which is assumed to be equal across countries The delay is the sum of two indepen-dent gamma distributions assumed to be equal across countries the incubation period andthe delay from onset of symptoms to confirmation We use previously published empiricaldistributions from China and Italy16ndash19 which mutually agree and give a mean infection-to-confirmation delay of 1035 days Similarly the infection-to-death delay is the sum of theincubation period and the (gamma distributed) delay from onset of symptoms to death1720

which sum up to a mean delay of 229 days Finally both the reported deaths and casesfollow a negative binomial noise distribution an inferred dispersion parameter as in relatedNPI models13

Using a Markov chain Monte Carlo (MCMC) sampling algorithm21 this model infers poste-rior distributions of each NPIrsquos effectiveness while accounting for cross-country variationsin testing reporting and fatality rates However it relies on the key assumptions that NPIs

9

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 10: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

have equal effects across countries and time and that changes in R are due to the observedNPIs To analyse the extent to which modelling choices affect the results our sensitivityanalysis includes all epidemiological parameters prior distributions and many of the struc-tural assumptions introduced above (Appendix B2 and Appendix C) MCMC convergencestatistics are given in Appendix C

Results

NPI Effectiveness

Our model enables us to estimate the individual effectiveness of each NPI expressed asa percentage reduction in R As in related work1ndash3 this percentage reduction is modelledas constant over countries and time and independent of the other implemented NPIs Inpractice however NPI effectiveness may depend on other implemented NPIs and localcircumstances Thus our effectiveness estimates ought to be interpreted as the effectivenessaveraged over the contexts in which the NPI was implemented in our data5 Our results thusgive the average NPI effectiveness across typical situations that the NPIs were implementedin Figure 3 (bottom left) visualizes which NPIs typically co-occurred aiding interpretation

The mean percentage reduction in R (with 95 credible interval) associated with each NPIis as follows (Figure 3) mandating mask-wearing in (some) public spaces 2 (-14ndash16) limiting gatherings to 1000 people or less 2 (-20ndash22) to 100 people or less21 (1ndash39) to 10 people or less 36 (16ndash53) closing some high-risk businesses31 (13ndash46) closing most nonessential businesses 40 (22ndash55) closing schoolsand universities 39 (21ndash55) and issuing stay-at-home orders 18 (4ndash31)

Some NPIs frequently co-occur ie are collinear However we are able to isolate the ef-fects of individual NPIs since the collinearity is imperfect and our dataset is large For everypair of NPIs we observe one of them without the other for 748 country-days on average(Appendix D3) The minimum number of country-days for any NPI pair is 143 (for lim-iting gatherings to 1000 or 100 attendees) Additionally under excessive collinearity andinsufficient data to overcome it individual effectiveness estimates are highly sensitive tovariations in the data and model parameters22 High sensitivity prevented Flaxman et al1

who had a smaller dataset from disentangling NPI effects6 Our estimates are substantiallyless sensitive (see below) Finally the posterior correlations between the effectiveness esti-mates are weak suggesting manageable collinearity (Appendix D4)

Although the correlations between the individual estimates are weak we should take theminto account when evaluating combined effects of NPIs For example if two NPIs fre-quently co-occur there may be more certainty about the combined effect than about thetwo individual effects Figure 4 shows the combined effectiveness of the sets of NPIs thatare most common in our data All NPIs together reduce R by 82 (79ndash85) Acrossour countries the mean R without any NPIs (ie R0) is 38 matching the mean result ofFlaxman et al1 (Table D2 reports R0 for all countries) Starting from this number the esti-

10

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 11: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

-100 -50 0 50 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spaces

Gatherings limited to1000 people or less

Gatherings limited to100 people or less

Gatherings limited to10 people or less

Some businessesclosed

Most nonessentialbusinesses closed

Schools and universitiesclosed

Stay-at-home order(with exemptions)

北 便

i

j

Frequency i Active Given j Active

25

50

75

100

0 500 1000 1500 2000 2500 3000

Days

Total Days Active

Figure 3 Top NPI effects The Figure shows the average percentage reductions in R as observed inour data (or in terms of the model the posterior marginal distributions of 1minusexp(minusαi )) with median50 and 95 credible intervals A negative 1 reduction refers to a 1 increase in R Cumulativeeffects are shown for hierarchical NPIs (gathering bans and business closures) ie the result for Mostnonessential businesses closed shows the cumulative effect of two NPIs with separate parameters andsymbols - closing some (high-risk) businesses and additionally closing most remaining (non-high-riskbut nonessential) businesses given that some businesses are already closed Finally we show the jointeffect of closing both schools and universities because the dates of school and university closures nearlyperfectly coincide in our data and we cannot meaningfully isolate their individual effects (AppendixD2) Bottom Left Conditional activation matrix Cell values indicate the frequency that NPI i (x-axis)was active given that NPI j (y-axis) was active Eg schools were always closed whenever a stay-at-home order was active (bottom row third column from the right) but not vice versa Bottom RightTotal number of days each NPI was active across all countries

11

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 12: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Mask-wearingMost businesses closed Schools Closed Universities Closed Stay-at-home order

Some businessesclosed

Figure 4 Combined NPI effectiveness for the most common sets of NPIs in our data by size of the NPI setShaded regions denote 50 and 95 credible intervals Left Maximum R0 that can be reduced to below1 for each set of NPIs Right Predicted R after implementation of each set of NPIs assuming R0 = 38Readers can interactively explore the effects of all sets of NPIs at httpepidemicforecastingorgcalc

mated R can be reduced below 1 by closing schools and universities high-risk businessesand limiting gathering sizes Readers can interactively explore the effects of sets of NPIsat httpepidemicforecastingorgcalc A CSV file containing the joint effectiveness of allpossible NPI combinations is available online here

Validation

We perform a range of experiments to study the robustness and calibration of our NPI effec-tiveness estimates (Appendix B with further experiments in Appendix C) We analyse howthe model extrapolates to unseen countries and periods and perform multiple sensitivityanalyses Amongst other things we analyse how results change if we vary epidemiologicalparameters or vary the data (using only deaths or confirmed cases as observations ex-cluding countries from the data) To investigate our key assumptions we show results forseveral alternative models (structural sensitivity) analyze the role of NPI timing and ex-amine possible confounding of our estimates by unobserved factors influencing R Figure 5summarises these analyses by showing how each NPIrsquos effectiveness is ranked comparedto other NPIs and how its rank is distributed across all experiment conditions The strongagreement across these many analyses increases our confidence in the results while alsoshowing that the precise effectiveness estimates come with additional uncertainty

Discussion

We use a data-driven approach to estimate the effects of eight nonpharmaceutical interven-tions on COVID-19 transmission in 41 countries All eight NPIs together reduce R by 82

12

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 13: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

12345678NPI Median Effectiveness Ranking

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Global NPI Rank Stability

25

50

75

100

Figure 5 Ranking of NPIs by median effectiveness across all sensitivity analyses (15 sensitivity analyseswith a total of 96 experiment conditions) In each analysis NPI effects were estimated under severaldifferent plausible variations of the model or the data (Appendix B and Appendix C) The colourindicates in which fraction of all experiment conditions an NPI occupied a given rank The sensitivityanalyses aggregated in this Figure are shown in Figures B8 B9 C11 C12 C13 and B7A

(79ndash85) This finding is in strong agreement with the joint effect estimated in elevencountries by Flaxman et al1 and contributes to the mounting evidence that NPIs can beeffective at mitigating and suppressing outbreaks of COVID-19 Furthermore our resultssuggest that some NPIs outperform others While the exact effectiveness estimates varymildly the qualitative conclusions discussed below are robust across 15 sensitivity analyses

Business closures and gathering bans both seem effective at reducing COVID-19 transmis-sion Closing only high-risk businesses (mean reduction in R 31) appears only somewhatless effective than closing most nonessential businesses (40) making it the more promis-ing policy option in some circumstances Limiting gatherings to 10 people or less (36) wasmore effective than limits up to 100 (21) or 1000 people (2) This may reflect that smallgatherings are common As previously discussed we estimate the average additional effecteach NPI had in the contexts where it was implemented When countries introduced stay-at-home orders they nearly always also banned gatherings and closed schools universitiesand nonessential businesses if they had not done so already Flaxman et al1 and Hsianget al4 add the effect of these distinct NPIs to the effectiveness of stay-at-home orders andaccordingly find a large effect In contrast we and Banholzer et al3 isolate the additionaleffect of ordering the population to stay at home and instead find a smaller effect (18) Atypical country can reduce R to below 1 without a stay-at-home order (Figure 4) providedother NPIs stay active

Mandating mask-wearing in various public spaces had a small positive effect on averagein the countries we studied (2) This does not rule out that mask-wearing has a largereffect in other contexts In our data mask-wearing was only mandated when other NPIshad already reduced public interactions When most transmission occurs in private spaces

13

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 14: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

wearing masks in public is expected to be less effective This might explain why a largereffect was found in studies that included China and South Korea where mask-wearingwas introduced earlier223 While there is an emerging body of literature indicating thatmask-wearing can be effective in reducing transmission the bulk of evidence comes fromhealthcare settings24 In non-healthcare settings risk compensation25 and open questionsabout different types of masks play a larger role potentially reducing effectiveness Whileour results cast doubt on reports that mask-wearing is the main determinant shaping acountryrsquos epidemic23 the policy still seems promising given all available evidence due toits comparatively low economic and social costs Its effectiveness may increase as otherNPIs are lifted and public interactions return

We find a surprisingly large effect for school and university closures an average 39 re-duction in R This finding is remarkably robust across different model structures variationsin the data and epidemiological assumptions (Figure B7) It remains robust when control-ling for NPIs excluded from our study and the onset time of major government intervention(Figures B9 and C10) Since school and university closures almost perfectly coincide inthe countries we study an approach such as ours cannot distinguish their individual effects(Appendix D2) Furthermore it cannot distinguish direct and indirect effects such asforcing parents to stay at home or causing broader behaviour changes by increasing publicconcern

Previous evidence on school and university closures is mixed1326 Early data suggest thatchildren and young adults are equally susceptible to infection but have a notably lowerobserved case rate than older adultsmdashwhether this is due to school and university closuresremains unknown27ndash29 Although infected young people are often asymptomatic they ap-pear to shed similar amounts of virus as older people3031 and might therefore circulatethe infection to higher-risk demographics unknowingly This also limits our ability to detectlarge outbreaks in educational facilities which closed in nearly all countries before suchdetection was feasible (with exceptions32) As outbreaks detected in UK schools are rapidlyincreasing33 this topic merits further study

Our study has several limitations First NPI effectiveness may depend on the context ofimplementation such as the presence of other NPIs and country-specific factors Our esti-mates must be interpreted as the average effectiveness over the contexts in our dataset5

and expert judgement is required to adjust them to local circumstances Second R mayhave been reduced by unobserved NPIs or spontaneous behaviour changes To investigatewhether these reductions could be falsely attributed to the observed NPIs we perform sev-eral additional analyses and find that our results are stable to a range of unobserved effects(Appendix B2) However this cannot give final certainty Investigating the role of unob-served effects is an important topic for future investigations Third our results cannot beused without qualification to predict the effect of lifting NPIs For example closing schoolsand universities seems to have greatly reduced transmission but this does not mean that re-opening them will cause infections to soar Educational institutions can (and do) implementsafety measures such as reduced class sizes Further work is needed to analyse the effectsof reopenings our collected data may be instrumental Fourth while we included more

14

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 15: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

NPIs than previous work (Table F4) several promising NPIs were excluded For exampletesting tracing and case isolation may be an important part of a cost-effective epidemicresponse34 but were not included because it is difficult to obtain comprehensive data Wediscuss further limitations in Appendix E

Currently governments across the world are seeking to keep R below 1 while minimising thesocial and economic costs of their interventions We hope that our results can guide policydecisions on which restrictions to lift and which NPIs to implement in any potential secondwave of infections Additionally our results show which areas of public life are most inneed of restructuring so that they can continue despite the pandemic However our resultsshould not be seen as the final answer on NPI effectiveness but rather as a contributionto a diverse body of evidence alongside other retrospective studies experimental trialssimulations and clinical experience

15

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 16: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Acknowledgements

Jan Brauner was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] and by Cancer Research UK Soumlren Minder-mannrsquos funding for graduate studies was from Oxford University and DeepMind MrinankSharma was supported by the EPSRC Centre for Doctoral Training in Autonomous Intel-ligent Machines and Systems [EPS0240501] Gavin Leech was supported by the UKRICentre for Doctoral Training in Interactive Artificial Intelligence [EPS0229371]

The paid contractor work in the data collection and the development of the interactivewebsite was funded by the Berkeley Existential Risk Initiative

We thank Jacob Lagerros for operational support and for introducing some of the authorsto each other We thank Maksym Balatsko Marek Pukaj and Tomaacuteš Witzany for developingthe interactive website

Declarations of interest

No conflicts of interests

Authorsrsquo contributions

D Johnston JM Brauner J Kulveit G Altman AJ Norman JT Monrad G Leech V Mikulikdesigned and conducted the NPI data collection S Mindermann M Sharma JM BraunerAB Stephenson H Ge YW Teh Y Gal J Kulveit T Gavenciak J Salvatier MA Hartwick LChindelevitch designed the model and modelling experiments M Sharma AB StephensonT Gavenciak J Salvatier performed and analysed the modelling experiments J KulveitT Gavenciak JM Brauner conceived the research S Mindermann T Besiroglu J KulveitJM Brauner did the literature search JM Brauner S Mindermann M Sharma G LeechT Besiroglu V Mikulik wrote the manuscript All authors read and gave feedback on themanuscript and approved the final manuscript JM Brauner S Mindermann and M Sharmacontributed equally Y Gal and J Kulveit contributed equally to senior authorship

Keywords

COVID-19 SARS-CoV-2 nonpharmaceutical intervention countermeasure Bayesian model

16

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 17: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

3 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

4 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

5 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

6 Kristian Soltesz et al ldquoOn the sensitivity of non-pharmaceutical intervention modelsfor SARS-CoV-2 spread estimationrdquo In medRxiv (2020)

7 Cindy Cheng et al ldquoCOVID-19 Government Response Event Dataset (CoronaNet v10)rdquo In Nature Human Behaviour (2020) pp 1ndash13

8 Oxford Covid Government Response Tracker July 2020 URL httpsgithubcomOxCGRTcovid-policy-tracker

9 Johns Hopkins University Center for Systems Science and Engineering COVID-19 DataRepository by the Center for Systems Science and Engineering (CSSE) at Johns HopkinsUniversity httpsgithubcomCSSEGISandDataCOVID-19 2020

10 Epidemic Forecasting Global NPI Database httpepidemicforecastingorgdatasets2020

11 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

12 Mask4All What Countries Require Masks in Public or Recommend Masks https masks4all co what - countries - require - masks - in - public (Accessed on05242020)

13 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

14 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

15 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

17

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 18: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

16 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

17 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

18 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

19 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

20 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

21 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

22 Christopher Winship and Bruce Western ldquoMulticollinearity and model misspecifica-tionrdquo In Sociological Science 327 (2016) pp 627ndash649

23 Renyi Zhang et al ldquoIdentifying airborne transmission as the dominant route for thespread of COVID-19rdquo In Proceedings of the National Academy of Sciences 11726 (June2020) pp 14857ndash14863 ISSN 0027-8424 1091-6490 DOI 101073pnas2009637117

24 Derek K Chu et al ldquoPhysical distancing face masks and eye protection to preventperson-to-person transmission of SARS-CoV-2 and COVID-19 a systematic review andmeta-analysisrdquo In The Lancet (2020)

25 Graham P Martin Esmeacutee Hanna and Robert Dingwall ldquoUrgency and uncertaintycovid-19 face masks and evidence informed policyrdquo In BMJ 369 (2020)

26 Juanjuan Zhang et al ldquoChanges in contact patterns shape the dynamics of the COVID-19 outbreak in Chinardquo In Science (2020)

27 Nisha S Mehta et al ldquoSARS-CoV-2 (COVID-19) What do we know about childrenA systematic reviewrdquo In Clinical Infectious Diseases (May 2020) DOI 101093cidciaa556

28 Petra Zimmermann and Nigel Curtis ldquoCoronavirus Infections in Children IncludingCOVID-19rdquo In The Pediatric Infectious Disease Journal 395 (May 2020) pp 355ndash368DOI 101097inf0000000000002660

29 When Should a School Reopen Final Report httpwwwindependentsageorgwp-contentuploads202005Independent-Sage-Brief-Report-on-Schools-5pdf(Accessed on 05282020) May 2020

18

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 19: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

30 Terry C Jones et al ldquoAn analysis of SARS-CoV-2 viral load by patient agerdquo 2020

31 Arnaud G LrsquoHuillier et al ldquoShedding of infectious SARS-CoV-2 in symptomatic neonateschildren and adolescentsrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv(May 2020) DOI 1011012020042720076778

32 Arnaud Fontanet et al ldquoCluster of COVID-19 in northern France A retrospective closedcohort studyrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041820071134

33 Weekly Coronavirus Disease 2019 (COVID-19) Surveillance Report - week 26 httpsassetspublishingservicegovukgovernmentuploadssystemuploadsattachment_datafile895356Weekly_COVID19_Surveillance_Report_w26pdf2020

34 Tim Colbourn et al Modelling the Health and Economic Impacts of Population-Wide Test-ing Contact Tracing and Isolation (PTTI) Strategies for COVID-19 in the UK ID 3627273June 2020 DOI 102139ssrn3627273 URL httpspapersssrncomabstract=3627273

19

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 20: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix

Table of ContentsAppendix A Modelling details 21

Appendix A1 Detailed model description 21

Appendix A2 Technical Model Description 24

Appendix B Validation 28

Appendix B1 Unseen data 28

Appendix B2 Sensitivity Analysis 28

Appendix B3 Robustness to unobserved effects 31

Appendix C Additional sensitivity analyses and validation 33

Appendix C1 Role of NPI timing 33

Appendix C2 Sensitivity to data preprocessing 34

Appendix C3 Sensitivity to additional epidemiological assumptions 35

Appendix C4 Additional country exclusions 37

Appendix C5 Validation using predictions in excluded countries 38

Appendix C6 MCMC stability results 43

Appendix C7 Posterior predictive distributions 44

Appendix D Additional results 45

Appendix D1 Estimated R0 by country 45

Appendix D2 The individual effects of school and university closures 46

Appendix D3 Collinearity 46

Appendix D4 Correlations between effectiveness estimates 47

Appendix E Additional discussion of assumptions and limitations 49

Appendix E1 Limitations of the data 49

Appendix E2 Model limitations 49

Appendix F Overview of previous work 50

Appendix G Handling edge cases in the data collection 53

Appendix H References 57

20

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 21: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix A Modelling details

Appendix A1 Detailed model description

For each country c

For each day t

New infections

( noise)

N (sdot)tcsdot

Daily reproduction number Rtc

New confirmed cases or deaths Ctc Dtc

Basic reproduction number R0c

Delay from infection to confirmation death

Mask wearing

Symptomatic testing

Gatherings limited to

10

Gatherings limited to

100

Some businesses

closed

Gatherings limited to

1000

Many businesses

closed

Schools and universities

closed

Stay-home order

Growth reductions from interventions αi i

Daily growth rate gtcGeneration interval

Product of active reductions

= 1 if is onϕitc i

For each intervention i

Figure A6 Model Overview Purple nodes are observed or have a fixed distribution We describe thediagram from bottom to top Each NPIrsquos effectiveness is characterised by αi which is independent ofthe country On each day a countryrsquos daily reproduction number Rt c only depends on that countryrsquosbase reproduction number R0c and the active NPIs (Φi t c ) Rt c is transformed to the daily growth rategt c which is used to compute the new infections N (C )

t c and N (D)t c that will turn into confirmed cases

and deaths respectively Finally the number of new confirmed cases Ct c and deaths Dt c is computedby convolution of N (middot)

t c with the respective delay distributions The same model structure is used forconfirmed cases and deaths The model combines both observations it splits all nodes above the dailygrowth rate gt c into separate branches for deaths and cases

We construct a semi-mechanistic Bayesian hierarchical model similar to Flaxman et al1 Themain difference is that we model both confirmed cases and deaths allowing us to leveragesignificantly more data Furthermore we do not assume a specific infection fatality rate(IFR) since we do not aim to infer the total number of COVID-19 infections The end of thissection details further adaptations which allow us to minimize assumptions about testingreporting and the IFR A list of all technical details is given in Appendix A2

We describe the model in Figure A6 from bottom to top The epidemicrsquos growth is deter-mined by the time-and-country-specific (instantaneous) reproduction number Rt c It de-pends on a) the basic reproduction number R0c without any NPIs active and b) the activeNPIs We place a prior (and hyperprior) distribution over R0c reflecting the wide disagree-ment of regional estimates of R02 We parameterize the effectiveness of NPI i assumed to

21

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 22: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

be same across countries and time with αi Each NPI is assumed to have an independentmultiplicative as on Rt c as follows

Rt c = R0c

Iprodi=1

exp(minusαi φi t c

) (A1)

where φi ct = 1 means NPI i is active in country c on day t (φi ct = 0 otherwise) and I isthe number of NPIs We use a weakly informative symmetric prior αi simN (002) allowingfor both positive and negative effects because we presently cannot rule out that some NPIsdirectly or indirectly increase transmission

Growth rates Nt c denotes the number of new infections at time t and country c In theearly phase of an epidemic Nt c grows exponentially with a dailya growth rate g t c Duringexponential growth there is a well-known one-to-one correspondence between g t c andRt c 3

Rt c = 1

M(minus log(1+ g t c )) (A2)

where M(middot) is the moment-generating function of the distribution of the generation interval(the time between successive cases in a transmission chain) We assume that the generationinterval distribution is given by a gamma distribution with mean 667 days and standarddeviation 21 The mean is based on an Italian study4 which is deemed most relevant toEuropean countries5 and the standard deviation stems from a international set of countriessince in European countries it has only been estimated for the serial rather than generationinterval to our knowledge67 Using (A2) we can write g t c as g t c (Rt c ) (see Appendix A2)

Infection model Rather than modelling the total number of new infections Nt c we modelnew infections that will either be subsequently a) confirmed positive N (C )

t c or b) lead to areported death N (D)

t c These are backwards-inferred from the observation models for casesand deaths shown further below We assume that both grow at the same expected rate g t c

N (C )t c = N (C )

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(C )τc

)](A3)

N (D)t c = N (D)

0c

tprodτ=1

[(1+ gτc ) middotexp

(ε(D)τc

)](A4)

aMany epidemiological models define growth rates as the exponent r in an exponential growth functionHere we use daily growth rates instead for ease of exposition These choices are mathematically equivalentNote that we adapted equation (29) in Wallinga amp Lipsitch3 to account for our choice

22

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 23: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

where ε(middot)τc sim N (0σN = 02) are separate independent noise terms Noise on the infection

numbers is not used by Flaxman et al1 but has a history in epidemic modelling8 Empiri-cally we find that it leads to substantially more robust effectiveness estimates9

We select σN by cross validation as no reference is available for it We did not tune any otheraspect of the modelmdashinstead we use choices from Flaxman et al1 or the most relevantavailable sources We evaluate five different values (σN isin 00501020304) fitting themodel on 35 countries each time and evaluating on a fixed randomly chosen validationset of 6 countries We select σN = 02 to maximize the log-likelihood on the validation setCross validation ensures a more calibrated model less likely to produce overconfident andunstable estimates10 The final model with σN = 02 is then evaluated on data from 20 held-out days in all countries which were not used to adjust any aspect of the model (FigureB7) We find that different values for σN produce very similar effectiveness estimates butlarger values lead to greater uncertainty and robustness (Figure C12 and Sharma et al9)

We seed our model with unobserved initial values N (C )0c and N (D)

0c which have uninformativepriorsb

Observation model for confirmed cases The mean predicted number of new confirmed casesis a discrete convolution

C t c =tsum

τ=1N (C )

tminusτc PC (delay= τ) (A5)

where PC (delay) is the distribution of the delay from infection to confirmation This delaydistribution is the sum of two independent gamma distributions the incubation periodand the delay from onset of symptoms to confirmation We use previously published andagreeing empirical distributions from China and Italy411ndash13 which sum up to a mean delayof 1035 days Finally the observed cases Ct c follow a negative binomial noise distributionwith mean C t c and an inferred dispersion parameter following Flaxman et al1

Observation model for deaths The mean predicted number of new deaths is a discreteconvolution

D t c =tsum

τ=1N (D)

tminusτc PD (delay= τ)

where PD (delay) is the distribution of the delay from infection to death It is also the sumof two independent gamma distributions the aforementioned incubation period and thedelay from onset of symptoms to death1114 which sum up to a mean delay of 229 daysFinally the observed deaths D t c also follow a negative binomial distribution with mean D t c

and the same inferred dispersion parameter used for observed cases

bSince we treat new infections as a continuous number its initial value can (and often should) be between0 and 1

23

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 24: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Single and combined models To construct models which only use either confirmed cases ordeaths as observations we remove the variables corresponding to the disregarded observa-tions

Testing reporting and infection fatality rates Scaling all values of a time series by a con-stant does not change its growth rates The model is therefore invariant to the scale of theobservations and consequently to country-level differences in the IFR and the ascertainmentrate (the proportion of infected people who are subsequently reported positive) For exam-ple assume countries A and B differ only in their ascertainment rates Then our model willinfer a difference in N (C )

t c (Eq (A5)) but not in the growth rates g t c across A and B (Eq(A3)-(A4)) Accordingly the inferred NPI effectiveness will be identicalc

In reality a countryrsquos ascertainment rate (and IFR) can also change over time In principleit is possible to distinguish changes in the ascertainment rate from the NPIsrsquo effects de-creasing the ascertainment rate decreases future cases Ct c by a constant factor whereas theintroduction of an NPI decreases them by a factor that grows exponentially over timed Thenoise term exp

(ε(C )τc

)(Eq (A3)) mimic changes in the ascertainment ratemdashnoise at time τ

affects all future casesmdashand allows for gradual multiplicative changes in the ascertainmentrate

We infer the unobserved variables in our model using Hamiltonian Monte-Carlo1516 (HMC)a standard MCMC sampling algorithm

Appendix A2 Technical Model Description

Variables are indexed by intervention i country c and day t All prior distributions areindependent

bull Data1 NPI Activations φi t c isin 012 Smoothed Observed Cases Ct c 3 Smoothed Observed Deaths D t c

bull Prior Distributions

cThis is only approximately true The negative binomial output distribution has a coefficient of variationdiminishing with its mean ie smaller observations are relatively more noisy and carry less weight Further-more whilst the prior over N (C )

0c could break scale invariance the uninformative prior results in a negligibleeffect

dHowever our model may struggle when the ascertainment rate also changes exponentially over time Thiscould happen when a country reaches its testing capacity See Appendix E

24

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 25: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

1 Country-specific R0

R0c = Normal(R0κ) (A6)

R0 = 325 based on a meta analysis17 (A7)

κsim Half Normal(micro= 0σ= 05) (A8)

2 NPI Effectiveness

αi sim Normal(micro= 0σ=p02) (A9)

(A10)

3 Infection Initial Counts (uninformative priors)

N (C )0c = exp(ζ(C )

c ) (A11)

N (D)0c = exp(ζ(D)

c ) (A12)

ζ(C )c sim Normal(micro= 0σ= 50) (A13)

ζ(D)c sim Normal(micro= 0σ= 50) (A14)

(A15)

4 Observation Noise Dispersion Parameter

Ψsim Half Normal(micro= 0σ= 5) (A16)

bull Hyperparameters1 Infection Noise Scale σN = 02 (selected by cross-validation)

bull Epidemiological parameters1 Generation Interval Parameters The generation interval is assumed to have a

Gamma distribution with mean 667 and standard deviation 21 days4ndash6 This leadsto a distribution Gamma(α= 79β= 12)

2 Delay Distributions The time from infection to confirmation is assumed to bethe sum of the incubation period and the time taken from symptom onset to lab-oratory confirmation Therefore the time taken from infection to confirmationT (C ) is411ndash13

T (C ) sim Gamma(micro= 51σ

micro= 086)+Negative Binomial(micro= 525α= 157) (A17)

The time from infection to death is assumed to be the sum of the incubationperiod and the time taken from symptom onset to death Therefore the timetaken from infection to death T (D) is11114

T (D) sim Gamma(micro= 51σ

micro= 086)+Gamma(micro= 178

σ

micro= 045) (A18)

25

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 26: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

where α is known as the dispersion parameter Caution larger values of αcorrespond to a smaller variance and less dispersion With our parameterisationthe variance of the Negative Binomial distribution is micro+ micro2

α For computational efficiency we discretise this distribution using Monte Carlosampling We therefore form discrete arrays πC[i ] and πD[i ] where the value ofπC[i ] corresponds to the probability of the delay being i days We truncate πC

to a maximum delay of 31 days and πD to a maximum delay of 63 days

bull Infection Model1 Rt c = R0c middotexp

(minussum9i=1αi φi t c

)

2 g t c = exp

(β(R

ct minus1)

)minus 1 where α and β are the parameters of the generation

interval distribution This is the exact conversion under exponential growth fol-lowing eq (29) in Wallinga amp Lipsitch3 (Note that we use daily growth rates)

3

N (C )t c = N (C )

0c

tprodτ=1

[(gτc +1) middotexpε(C )

τc

] (A19)

N (D)t c = N (D)

0c

tprodτ=1

[(gτc +1) middotexpε(D)

τc )]

with noise (A20)

ε(C )τc sim Normal(micro= 0σ=σN ) (A21)

ε(D)τc sim Normal(micro= 0σ=σN ) (A22)

N (C )t c represents the number of daily new infections at time t in country c who

will eventually be tested positive (N (D)t c similar but for infections who will pass

away)bull Observation Model We use discrete convolutions to produce the expected number

of new cases and deaths on a given day

Ct c =32sumτ=1

N (C )tminusτcπC [τ] (A23)

D t c =64sumτ=1

N (D)tminusτcπD [τ] (A24)

Finally the output distribution follows a Negative Binomial noise distribution as pro-posed by Flaxman et al1

Ct c sim Negative Binomial(micro= Ct c α=Ψ) (A25)

D t c sim Negative Binomial(micro= D t c α=Ψ) (A26)

α is the dispersion parameter of the distribution Caution larger values of α cor-respond to a smaller variance and less dispersion With our parameterisation thevariance of the Negative Binomial distribution is micro+ micro2

α so that smaller observationsare relatively more noisy

26

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 27: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

This model was implemented in PyMC318 with the NUTS MCMC sampling algorithm16

27

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 28: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix B Validation

Appendix B1 Unseen data

An important way to validate a Bayesian model is by checking how well it predicts unseendata even if prediction is not the purpose of the model1019 If an NPI effectiveness modelis entirely unable to extrapolate to unseen countries or to future unseen confirmed casesand deaths we have strong reason to doubt its effectiveness estimates However we donot expect NPI effectiveness models to extrapolate perfectly Almost always there will beunobserved factors affecting the observed number of cases and deaths such as changesin the ascertainment rate or IFR spontaneous behaviour changes and unobserved NPIsOur models ought to treat these factors as noise and not attribute their effects on R to theobserved NPIs

We fit our model while holding out the last 20 days of cases and deaths for all countriesand then extrapolate to the last 20 days (Figure B7 top left) A 20-day prediction ischallenging the longest attempted holdout period we found in data-driven NPI models was3 days1 and most other related work does not validate predictions on unseen data at all9

The model is well-calibrated with most points falling within the 95 credible intervalsThe model predicts a higher number than reported exactly twice as often as predicting alower number This suggests that unobserved factors have reduced R below what would bepredicted based on the active NPIs alone We would indeed expect most countries to havefewer cases and deaths than predicted solely from the eight NPIs in our model There areseveral other NPIs as well as further unobserved behaviour changes that we do not modelbut that likely reduce R on average The result suggests that these factors are at least to acertain extent successfully treated as noise instead of confounding the effects of NPIs

However note that the model shown in Figure B7 (top left) was fitted on 20 days less ofdata (per country) than the main model The predictions can thus only serve for modelvalidation insofar as we expect the model fitted on all days to have similar or better extrap-olation to unseen data as the model fitted on all but the last 20 days In further validationexperiments we analyse how the model extrapolates to individual countries left-out duringfitting and again find that it makes well-calibrated predictions (Appendix C5)

Appendix B2 Sensitivity Analysis

Sensitivity analysis reveals which results depend on uncertain parameters and modellingchoices and can diagnose model misspecification and excessive collinearity in the data20

We vary many of the components of our model and recompute the NPI effectiveness esti-mates summarised here Further analysis in Appendix C

Sensitivity to epidemiological parameters The epidemiological parameters in our modelare the delay from infection to reporting the delay from infection to death and the gener-ation interval Furthermore we specify a prior distribution over NPI effectiveness In Fig-

28

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 29: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

ure B7 (top right) we consider several alternative values for these parameters Consistentwith Flaxman et al1 and theoretical expectation3 we find that a shorter mean generationinterval implies a smaller initial R0 and therefore lower effectiveness estimates on averageHowever the estimate for banning large gatherings increases partly a consequence of in-cluding an unrealistically13 short generation interval of 4 days Restricting the prior to onlyallow NPIs to reduce but not increase R has no significant impact on the estimates (rsquoHalfNormalrsquo) Using an uninformative prior (log-normal with micro = 1 and σ = 10 rsquoWidersquo) ampli-fies differences between NPIs suggesting that our default prior is informative Using theprior of Flaxman et al increases the differences between NPIs an outcome which this priorencourages We do not use this prior in our main analysis because it is designed to makerealistic assumptions about the joint effect of all NPIs not about their individual effects

0 1 10 102 103 104

Reported

0

1

10

102

103

104

Pre

dict

ed

20 Day Extrapolation

CasesDeaths

-50 0 50 100

Country SensitivityDEHUFIIERSBE

-50 0 50 100

Country SensitivityDKGRNOFRROMA

-50 0 50 100

Country SensitivityESCZNLCHPTAT

-50 0 50 100

Country SensitivityILSEITMXGBPL

-50 0 50 100

Death Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Confirmed Delay4 days

2 days

+2 days+4 daysDefault

-50 0 50 100

Effectiveness PriorHalf NormalFlaxman et alWideDefault

-50 0 50 100

Generation Interval= 4= 5= 6= 7= 8= 667

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Data SourceConfirmed CasesConfirmed DeathsDefault

Figure B7 Results validation Top left We fit our model while holding out the last 20 days of cases anddeaths for all countries The figure shows the extrapolation to the last 20 days Each dot represents pre-dicted cases or deaths in one country 95 sampled credible intervals shown Observed cases and deathsare smoothed (see Methods) Others NPI effectiveness estimates when epidemiological parameters ordata are varied Median 50 and 95 credible intervals of the marginal posterior distribution of the ef-fectiveness parameters are shown Top right Sensitivity to epidemiological parameter choices Changesin the mean generation interval delay distributions between infection and case confirmationdeath andthe prior on NPI effectiveness Bottom left Sensitivity to removing one country at a time from the dataBottom right Sensitivity to different data sources using only confirmed cases only deaths or both

29

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 30: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Sensitivity to data Figure B7 (bottom right) shows the NPI effectiveness estimates frommodels that use only cases or deaths as observations in contrast to our main model whichuses both Reassuringly the three models have similar results This suggests that resultsare not biased by factors specific to deaths or confirmed cases such as changes in theascertainment rate IFR and model-specific time delays Figure B7 (bottom left) showsresults if one country at a time is excluded from the data As there is no strong justificationfor in- or excluding one particular country results ought to be stable if a country is excludedThis is indeed the case All countries are shown in Appendix C

Sensitivity to structurally different models A number of implicit structural assumptionsare made in our model We test sensitivity to these assumptions by evaluating NPI effective-ness estimates from alternative models reproducing the structural sensitivity analysis fromour concurrent work where these models are described in detail9

As Figure B8 shows all models support the conclusions we draw in the Discussion Themodels are

1 Different Effects Model The effectiveness of each NPI is allowed to vary across coun-tries

2 Discrete Renewal Model Instead of converting R into a daily growth rate a renewalprocess is used as the infection model as in a number of earlier works1821ndash23

3 Noisy-R Model The noise terms ε(middot)ct affect R rather than the growth rate as in Fraser8

4 Additive Effects Model Each NPI has an additive effect on R The joint effectivenessof a set of NPIs is produced by summing rather than multiplying their individualeffectiveness estimates

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or less

Some businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Structural SensitivityAdditiveNoisy-RDifferent EffectsDiscrete RenewalDefault

Figure B8 Structural sensitivity analysis Effectiveness estimates under different structural assump-tions Note that the additive model (blue) cannot be quantitatively compared to others (see text)

The results of the additive model (blue) cannot be directly compared to the other modelssince they are not expressed as percentage reductions in R but in R0 Its estimates aretherefore smaller (but support the same conclusions)

30

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 31: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

-50 -25 0 25 50 75 100

Mask-wearing mandatory in(some) public spacesGatherings limited to 1000people or lessGatherings limited to 100people or lessGatherings limited to 10people or lessSome businesses closed

Most businesses closed

Schools and universities closed

Stay-at-home order(with exemptions)

Inclusion of OxCGRT NPIsTravel ScreeningBansPublic Transport LimitedInternal Movement LimitedPublic Information CampaignsSymptomatic TestingDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Exclusion of Collected NPIsMask WearingGatherings lt1000Gatherings lt100Gatherings lt10Some Businesses SuspendedMost Businesses SuspendedSchool ClosureUniversity ClosureStay at Home OrderSchool and University Closure

Figure B9 Robustness to unobserved effects Left Results when controlling for previously unobservedNPIs We include one additional NPI in turn and show the estimates for the NPIs in our study (the ad-ditional NPI is not shown) Right Results when excluding previously observed NPIs We exclude oneof the NPIs in turn and show the estimates for the other NPIs Both Note that this figure shows theadditional effect of each NPI In other figures we show the cumulative effects for gathering bans andbusinesses closures denoted by showing multiple symbols (as explained in the caption to Figure 3)For example Figure 3 displays the total effect of closing most nonessential businesses while here weshow the additional effect of closing most nonessential businesses over just closing some high-risk busi-ness We show the additional effects here because the effect of a cumulative intervention would becomeundefined when part of it is excluded from the analysis

Appendix B3 Robustness to unobserved effects

Our data neither captures all NPIs implemented nor directly measures broader behaviouralchanges Since these factors influence R we must be wary of their effect being attributed toobserved NPIs We investigate this further by assessing how much effectiveness estimateschange when previously unobserved factors are included and also when observed factorsare excluded This is best practice for assessing robustness to unobserved factors2425 Wealso perform several additional investigations outlined in Appendix C1

Unobserved factors can bias results if their timing is correlated with the timing of the ob-served NPIs26 The timing of our observed NPIsrsquo implementation dates is indeed correlatedprompting the question how much excluding observed NPIs changes results Figure B9(right) shows NPI effectiveness estimates when previously observed NPIs are excluded inturn Estimates are robust with all 50 credible intervals mutually overlapping Consider-ing that some of the excluded NPIs have strong estimated effects when included and arecorrelated with other NPIs this degree of robustness is surprisingly high It suggests thatunobserved factors will not significantly bias results as long as their effects and their cor-relations with the studied NPIs do not exceed those of the studied NPIs We hypothesizethat this robustness to unobserved factors is due to the noise on infection numbers in ourmodel9

In addition Figure B9 (left) shows NPI effectiveness estimates when we include (ie con-trol for) additional NPIs taken from the OxCGRT dataset27

31

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 32: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Our conclusions are not affected by controlling for these NPIs This further suggests thatunobserved factors are successfully treated as noise

32

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 33: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix C Additional sensitivity analyses and validation

Note The individual effect of school and university closures cannot be meaningfully dis-entangled as discussed in Appendix D2 For completeness we still show the individualestimates for school and university closures at the bottom of each graph in this section

Appendix C1 Role of NPI timing

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and universitiesclosed

Stay-at-home order

Schools closed

Universities closed

NPI Timing 1Any ActiveBonus NPIDifferent ConfirmationDelaysSchools andUniversities DelayedDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

NPI 1

NPI 2

NPI 3

NPI 4

NPI 5

NPI 6

NPI 7

NPI 8

NPI 9

NPI Timing 2

Figure C10 Relationship between NPI effectiveness and timing Left 1) Effect of delaying school anduniversity closures by 6 days in the data 2) increasing confirmation delay by 2 days in countries withoutextensive testing and decreasing by 2 days in countries with testing and 3) controlling for the start ofgovernment intervention (ie any NPI active) Right Effectiveness for early and late NPIs The modelestimates the effect of all first second etc NPIs

There are several reasons to investigate the relationship between NPI timing and effective-ness First as previously discussed in Appendix B unobserved factors such as behaviorchange may typically happen when the first NPIs are introduced and could confound theireffects This could lead to the surprisingly high estimates for school and university closuressince these NPIs were often implemented early Second as discussed in the Results sec-tion NPIs effectiveness may depend on the presence of other NPIs and fewer NPIs will bepresent earlier which could reduce the additional effect of later NPIs

In Figure C10 (left) we show three additional experiments First following Flaxman etal1 we control for the onset of government intervention by introducing an lsquoNPIrsquo that isactive from the day the first NPI is implemented This test is intended to control for potentialconfounding from early unobserved NPIs and behavior changes The result suggests thatthe estimate for school and university closures is not confounded or otherwise biased by thefact that these NPIs were often mandated early However including this covariate increasesthe uncertainty and somewhat decreases the mean estimate for bans of larger gatheringswhich were also often the first NPIs implemented

Second we delay school and university closures by one mean generation interval (roundedto 6 days) This is motivated by the fact that children and adolescents are less likely to

33

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 34: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

show symptoms or die from COVID-19 Their infections may therefore show up with an ad-ditional delay in the case and death data as they must first infect higher-risk demographicsDelaying school and university closures also causes them to be one of the later NPIs in mostcountries a test if their high effectiveness may have been due to their relatively early ap-pearance As the inferred effect is stable we can rule out these potential concerns (FigureC10 left)

Third we relax the assumption that the delay from infection to confirmation is equal be-tween countries It may be longer in countries that mostly test patients in hospitals andnot in the community Therefore we increased the delay by 2 days in countries that didnot offer testing to symptomatic people (using data from the Oxford COVID GovernmentResponse Tracker27) and decreased it by 2 days in countries that did Although we do nothave exact data for these different delays 2 days is plausible based on hospital admissiondata28

In Figure C10 (right) we relabel the NPIs in each country as 1st NPI 2nd NPI etcWe then estimate the effect of these lsquoNPIsrsquo which can represent various actual NPIs Theeffectiveness ranking by order of implementation is 361875249 The result shows thatearlier NPIs are not necessarily estimated to be more effective alleviating our concern thatlater NPIs are less effective due to interacting with earlier NPIs

Appendix C2 Sensitivity to data preprocessing

In the following sections we show sensitivity to data preprocessing steps and additional pa-rameters and reprint sensitivity results from the previous section while additionally show-ing the individual estimates for school and university closures at the bottom of each graph

We have smoothed case and death data with a moving average over a window of plusmn2 daysto incorporate the prior knowledge that large jumps in the data are due to inconsistentreporting Here we show results for smaller and larger smoothing windows including nosmoothing (Figure C11 top right) There is no impact on our conclusions

Furthermore we exclude data in each country before 100 cases and 10 deaths are cumula-tively reached to avoid biasing the model with foreign-imported cases Here we vary thesethresholds Interestingly the death threshold has no clear effect on results whereas thecase threshold does (Figure C11 (bottom left bottom right) Raising the case thresholdup to 500 removes a large portion of our data in March making it difficult to determineR0 (this is not the case for the death threshold) In contrast making the case thresholdtoo low likely introduces bias created by foreign-imported cases and early changes in test-ing regimes with lead to an overestimate of R0 Nonetheless all choices support our mainconclusions (Discussion)

The other subfigures repeat sensitivity analyses from Appendix B with individual effects forschool and universitiy closures shown at the bottom

34

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 35: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Data SourceConfirmed CasesConfirmed DeathsDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Smoothing1 days3 daysDefault (5)7 days15 days

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Minimum Deaths Threshold35Default (10)3050

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Minimum Confirmed Cases Threshold1030Default (100)300500

Figure C11 Additional sensitivity analysis to data variations Top right Sensitivity to smoothing caseand death data over different windows (1 implies no smoothing) Bottom leftright Sensitivity to thethreshold for excluding case and death data

Appendix C3 Sensitivity to additional epidemiological assumptions

For completeness we test sensitivity to two further epidemiological assumptions Theseare 1) The prior on R0mdashwe vary its mean from a very small value (25) to the default value(325) and a large value (45) 2) The standard deviation of the noise σN on the growth rateAs previously discussed this parameter is set to 02 using cross-validation Other choices arenot necessarily reasonable as they lead to miscalibrated predictions (the parameter couldpotentially be inferred from the data but this would be computationally challenging) Weshow these other choices for completeness All choices support our conclusions and allcredible intervals overlap but higher noise as expected leads to higher uncertainty

Results are shown in Figure C12 (middle left middle right)

35

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 36: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Generation Interval= 8= 6= 7= 5= 4

Default ( = 667)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Effectiveness PriorFlaxman et alWideHalf NormalDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

R0 prior= 25= 45

Default ( = 325)

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Reporting DelayCases Delay +4 days

Cases Delay +2 daysCases Delay 4 days

Cases Delay 2 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

Death DelayDeaths Delay +2 daysDeaths Delay +4 days

Deaths Delay 2 daysDeaths Delay 4 daysDefault

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Growth Noisen = 04n = 005n = 01n = 02

Figure C12 Sensitivity to additional and previously shown epidemiological assumptions Middle leftSensitivity to the prior mean on R0 Others Repetition of previously shown results with individual effectfor school and university closures added

36

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 37: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix C4 Additional country exclusions

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultAlbaniaAndorraAustriaBelgiumBosnia and HerzegovinaBulgariaCroatia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultCzech RepublicDenmarkEstoniaFinlandFranceGeorgiaGermany

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultGreeceHungaryIcelandIrelandIsraelItalyLatvia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultLithuaniaMalaysiaMaltaMexicoMoroccoNetherlandsNew Zealand

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closedSchools and universitiesclosedStay-at-home order

Schools closed

Universities closed

DefaultNorwayPolandPortugalRomaniaSerbiaSingaporeSlovakia

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

DefaultSloveniaSouth AfricaSpainSwedenSwitzerlandUnited Kingdom

Figure C13 Sensitivity to excluding one country at a time from the data

37

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 38: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix C5 Validation using predictions in excluded countries

Evaluating predictions on unseen data is an important model validation step even if pre-diction is not the goal of the model29 We use 41-fold cross-validation fitting the model on40 countries and showing its predictions on the excluded country We repeat this processfor all 41 countries In the excluded country the first 14 days of death and case data areobserved to allow roughly inferring R0 These days are not used to infer NPI effectivenessThe activation dates of NPIs are also given and the model uses the effectiveness estimatesinferred from the 40 other countries

Our model makes sensible calibrated forecasts over long periods excluded in countries(Figures C14 to C17)

Explanation of the Figures Vertical lines show the activation (and deactivation) date ofNPIs Shaded areas are 95 credible intervals Left The yellow and green lines (rsquoDailyInfections - Later ReportedLater Fatalrsquo) show the estimates of daily new infections that willturn into confirmed cases (N (C )

t ) and deaths (N (D)t ) Blue and red dots show the observed

confirmed cases and deaths (smoothed) while blue and red lines show the median modelestimates of cases (Ct ) and deaths (D t ) Empty dots are not shown to the model For eachcountry we show the full window of analysis (from the start of the epidemic until the firstNPI was lifted or 30th of May 2020 whichever was earlier (see Methods)

38

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 39: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Albania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Andorra

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Austria

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Belgium

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Bosnia and Herzegovina

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

北 北

Bulgaria

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Croatia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Czech Republic

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Denmark

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Estonia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Finland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

France

Figure C14 Predictions on excluded countries

39

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 40: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Georgia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Germany

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便 便

Greece

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Hungary

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Iceland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Ireland

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Israel

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Latvia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Lithuania

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Malaysia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Malta

Figure C15 Predictions on excluded countries

40

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 41: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Mexico

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Morocco

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Netherlands

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

New Zealand

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Norway

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Poland

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Portugal

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

便便

Romania

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便便

Serbia

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Singapore

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

便 便

Slovakia

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

便便

Slovenia

Figure C16 Predictions on excluded countries

41

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 42: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

South Africa

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

Spain

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY29-MAY

100

101

102

103

104

105

106

Sweden

21-FEB6-MAR

20-MAR3-APR

17-APR

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily Deaths

Daily Infections - Later ReportedPredicted Daily Confirmed Cases

Recorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Switzerland

Figure C17 Predictions on excluded countries Empty dots are not shown to the model 14 initial daysare shown to the model to enable inferring the basic R0 but were not used to infer NPI effectiveness

42

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 43: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix C6 MCMC stability results

1000 1001 1002 10030

500

1000

1500

2000

2500

3000

3500

R

00 05 10 15 200

500

1000

1500

2000

2500

3000

3500

ESS

Figure C18 MCMC stability results Left R-hat statistic Values are close to 1 indicating convergenceRight Relative effective sample size Values of 1 indicate perfect decorrelation between samples Valuesover 1 indicate that the effective number of samples is higher than the actual number of samples (dueto negative correlation) and vice versa

43

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 44: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix C7 Posterior predictive distributions

The posterior predictive distribution (Figure C19) shows the predicted true number of casesand deaths after observing the data Although these curves can be called lsquofitsrsquo the degreeof fit to the data must be interpreted with great care The fit is generally tight but this ispartly due to working with inferred latent noise variables the noise terms ε(C )

t and ε(D)t on

the growth rates g t Inferring this latent noise allows the posterior predictive distributionto closely match the data without overfitting the effectiveness parameters to the data Suchbehavior is common in Bayesian models which often perfectly interpolate the data withoutoverfitting30 The noise terms can account for periods where infections grew faster or slowerthan predicted based solely on the active NPIs In such periods the noise may account forchanges in testing reporting and unobserved interventions

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

100

101

102

103

104

105

106

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

0

1

2

3

4

5

R t

United Kingdom

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

100

101

102

103

104

105

106

Daily Infections - Later FatalPredicted Daily DeathsDaily Infections - Later Reported

Predicted Daily Confirmed CasesRecorded Daily Deaths (Smoothed)Heldout Daily Deaths (Cases)

Recorded Daily Confirmed Cases (Smoothed)Heldout Daily Confirmed Cases (Cases)

Italy

21-FEB6-MAR

20-MAR3-APR

17-APR1-MAY

15-MAY

0

1

2

3

4

5

R t

Italy

Figure C19 Left Posterior predictive distributions for two exemplary countries See text Right In-ferred Rt over time

44

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 45: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix D Additional results

Appendix D1 Estimated R0 by country

Table D2 Estimated values for R0 by country The parenthesis give the 95 credible interval The meanR0 across countries is 38

Country Estimated R0 Country Estimated R0

Albania 392 (285525) Lithuania 355 (255478)Andorra 281 (203374) Malaysia 352 (267453)Austria 359 (275459) Malta 364 (2525)Belgium 433 (344539) Mexico 47 (358604)Bosnia and Herzegovina 374 (273493) Morocco 443 (344568)Bulgaria 432 (326563) Netherlands 353 (277441)Croatia 393 (293515) New Zealand 265 (182365)Czech Republic 415 (306546) Norway 296 (221389)Denmark 292 (217378) Poland 487 (369631)Estonia 288 (211378) Portugal 44 (34558)Finland 296 (221384) Romania 474 (37597)France 391 (314481) Serbia 444 (335574)Georgia 43 (319569) Singapore 353 (272445)Germany 351 (273443) Slovakia 383 (264529)Greece 339 (256439) Slovenia 325 (23244)Hungary 474 (366607) South Africa 565 (446713)Iceland 159 (098238) Spain 471 (38571)Ireland 442 (348556) Sweden 225 (17291)Israel 439 (334569) Switzerland 307 (236392)Italy 412 (335501) United Kingdom 394 (317483)Latvia 327 (233442)

45

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 46: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix D2 The individual effects of school and university closures

The dates of school and university closures coincide nearly perfectly for every country ex-cept Iceland and Sweden which closed universities but not schools (Figure 1) As a con-sequence the inferred individual effects depend strongly on the in- or exclusion of thesecountries in the dataset (Figure D20) We conclude that we cannot meaningfully disentan-gle these two NPIs based on only two countries and show their joint effect (Figure 3) forwhich there is much more data

-50 -25 0 25 50 75 100Average additional reduction in R in the context of our data

Schools closed

Universities closed

Schools and univerisities closed

Schools and Universities SensitivityIceland ExcludedSweden ExcludedDefault

Figure D20 The individual effectiveness of closing schools and of closing universities as well as thejoint effect of closings school and universities estimated on all countries (default) all countries exceptSweden and all countries except Iceland Median 50 and 95 credible intervals are shown Theindividual effect of school closures is highly sensitive to the in- or exclusion of one of the countries Ifwe include all countries we would conclude that the joint effect is mostly driven by university closuresHowever if we exclude Sweden we would conclude that schools closures play a larger role As there is nostrong justification for in- or excluding one particular country we conclude that we cannot meaningfullydisentangle the effects of school and university closures The combined effect is more stable

Appendix D3 Collinearity

Table D3 shows the total number of days across all countries available to distinguish NPIeffects For every pair of NPIs (row - column) the entry shows the number of country-days on which only one of the NPIs was implemented (but not both or none) Note thatwe do not show the traditional collinearity statistics variance inflation factors and datacorrelations since their applicability to time series data is limited In particular the valueof these statistics in our data increases as data for a longer time period becomes availablewhich would misleadingly suggest that we could address problems from collinearity byusing less data

46

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 47: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Table D3 Total number of days across all countries available to distinguish NPI effects For every pair ofNPIs (row - column) the entry shows the number of country-days on which exactly one of the NPIs wasimplemented Abbreviations G Gatherings SBC Some businesses closed MBC Most nonessentialbusinesses closed SaUC Schools and universities closed SaHO Stay-at-home order

G lt1000 G lt100 G lt10 SBC MBC SaUC SaHOMask-wearing 1829 1686 1472 1554 1315 1588 1038Gatherings lt1000 143 501 299 620 325 1173Gatherings lt100 358 240 515 284 1030Gatherings lt10 262 403 308 696Some businesses closed 331 176 898Most businesses closed 393 569Schools and universities closed 940

Appendix D4 Correlations between effectiveness estimates

The effectiveness parameters αi are typically negatively correlated with each other for NPIswhich are often used together reflecting uncertainty about which NPI is reducing R Ex-cessive collinearity in the data would result in wide posterior credible intervals with strongcorrelations20 but we find weak posterior correlations between effectiveness estimatesThe strongest correlation between any pair of NPIs is minus041 between limiting gatherings to100 attendants and 1000 attendants or less (Figure D21) The weak correlations are oneindicator that collinearity is manageable with our dataset

To better understand posterior correlations we visualize their effect in hosted video filesAs we condition on different values for one NPI we can see that the estimates of otherNPIs change only slightly always staying well within the credible intervals in Figure 3The significance of posterior correlations is small enough that it is possible to calculatea reasonable approximation to the mean effect of a set of NPIs by simply combining themean percentage reductions for each individual NPI (eg two 50 reductions lead to a75 reduction) For example this approximation leads to a joint effect of 82 for all NPIstogether which matches the exact mean joint effect (also 82) up to rounding error

Videos are available online here

47

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 48: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Mask-wearing

Gatherings lt1000

Gatherings lt100

Gatherings lt10

Some businesses closed

Most businesses closed

Schools and univeristies closed

Stay-at-home order

Posterior Correlation

100

075

050

025

000

025

050

075

100

Figure D21 Posterior correlations between effectiveness parameters αi

48

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 49: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix E Additional discussion of assumptions and limitations

Appendix E1 Limitations of the data

We only record NPIs if they are implemented in most of a country (if they affect more thanthree fourths of the population) We thus miss if NPIs were only implemented regionallyFor example a few regions in Germany implemented stay-at-home orders but most didnot Thus Germany is listed as no stay-at-home order in our data Additionally our NPIdefinitions were not perfectly granular For example a gathering ban on gatherings of gt15people and a ban on gatherings of gt60 people would both fall under the NPI Gatheringslimited to 100 people or less despite likely having different effects on R Finally whilewe included more NPIs than previous work (Table F4) there are many NPIs for which wewere not able to collect enough high-quality data for our modeling such as public cleaningor changes to public transportation

Appendix E2 Model limitations

Independence of country and time We assume that the effect of NPIs on R is constantacross countries and time However the exact implementation and adherence of each NPIsis likely to vary Our uncertainty estimates in Figure 3 account for these problems only toa strictly limited degree Additionally different countries have different cultural norms andage profiles affecting the degree to which a particular intervention is effective For examplea country where a higher proportion of the population is in education will likely observea larger effect from a government order to close schools and universities Our estimatesthus should be adjusted to local circumstances To address differences between countriesour structural sensitivity analysis includes a model where each NPI can have a differenteffect per country (Appendix C) The average effectiveness estimates across countries inthis model match the conclusions from our default model

Testing reporting and the IFR Our model can account for differences in testing (andIFRreporting) between countries and over time as discussed in Appendix A Howeverwe have not used additional data on testing to validate if it does so reliably Our model maystruggle to account for changes in the testing regimemdashfor instance when a country reachesits testing capacity so that the ascertainment rate declines exponentially An exponentialdecline would have the same effect on observations as an unobserved NPI Consequentlywe cannot quantify its effect on our results (though the sensitivity analyses look promising)

Interaction between NPIs As discussed in the Results section our model reports the aver-age additional effect each NPI had in the contexts where it was active in our data (in thesense mathematically shown by Sharma et al9) Figure 3 (bottom left) summarises thesecontexts aiding interpretation The effectiveness of an NPI can only be extrapolated toother contexts if its effect does not depend on the context For example we may expect

49

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 50: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

that closing schools has a similar effectiveness whether or not businesses are also closedBut wearing masks in public may be less effective when a stay-at-home order limits publicinteractions

Growth rates The functional form of the relationship between the daily growth rate of thenumber of infections g and the reproductive number R holds exactly when the epidemicis in its exponential growth phase but becomes less accurate as the number of susceptiblepeople in a population decreases andor control measures are implemented However wealso reported results from a renewal process model8 that lacks this assumption and findssimilar effectiveness estimates

Signalling effect of NPIs As we explained in the Discussion for school closures we do notdistinguish between the direct effect of an NPI and its indirect effect as it signals the gravityof the situation to the public Conversely lifting interventions may also have a signallingeffect

Homogeneous effect of interventions We work under the implicit assumption that NPIs af-fect different population groups equally This could affect results in various ways For exam-ple suppose country A tests an older demographic than country B and we are consideringthe effect of an NPI that mostly affects the older demographic (for example isolating theelderly) Then the NPI will appear to have a greater effect on confirmed cases in country Abreaking the assumption that effects are constant across countries Our previous discussionof interpreting results when this assumption is violated applies

Appendix F Overview of previous work

Table F4 Existing data-driven studies of the effectiveness of observed (as opposed to hypothetical) NPIsin reducing the transmission of COVID-19

Study NPIs studiedRegionscountries

studiedMethod

Flaxman etal 20201

School or universityclosure case-basedisolation ban on

large public eventssocial distancing

lockdown

Austria BelgiumDenmark France

Germany Italy NorwaySpain SwedenSwitzerland UK

Semi-mechanisticBayesian hierarchical

model

Continued on next page

50

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 51: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Chen andQiu 202031

Travel restrictionmask-wearing

lockdown socialdistancing school

closure centralizedquarantine

Italy Spain GermanyFrance UK Singapore

South Korea China US

Regression withdelayed effect

Susceptible-Infectious-Removed (SIR)

model

Banholzer etal 202032

School closureborder closure eventban gathering ban

venue closurelockdown work ban

US Canada AustraliaAustria Belgium

Denmark FinlandFrance Germany Greece

Ireland ItalyLuxembourg the

Netherlands PortugalSpain Sweden UKNorway Switzerland

Semi-mechanisticBayesian hierarchical

model

Hsiang et al202033

Restricting travel (5subcategories)distancing (10subcategories)quarantine and

lockdown (2subcategories)

additional policies (2subcategories)

China South Korea ItalyIran France US

Linear regression onestimated growth

rates

Choma et al202034

Single aggregatedNPI

22 countries and 25 states

Regression withSusceptible-Infectious-

Removed-Deceased(SIRD) model

Dehning etal 202035

Contact banrestrictions on

gatherings schoolschildcare businesses

GermanyBayesian inference of

transmission rate

Siedner etal 202036

General socialdistancing

USInterruptedtime-series

Kraemer etal 202037

Travel restrictionsand cordon sanitaire

China Regression

Kucharski etal 202038 Travel restrictions Wuhan (China)

Various includingSusceptible-Exposed-Infectious-Removed

(SEIR) modelContinued on next page

51

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 52: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Table F4 ndash Continued from previous page

Study NPIs studiedRegionscountries

studiedMethod

Dandekarand

Barbastathis202039

General quarantineand isolation

Wuhan Italy SouthKorea and US

A mix of a mechanisticmodel and a

data-driven neuralnetwork model

Maier andBrockmann

202040

General quarantineand isolation

Mainland ChinaQuantitative fits to

empirical data

Sears et al202041

Mobility changes as aproxy for

stay-at-homemandates

USDifference-in-

differences statisticalmodel

Jarvis et al202042

Physical (social)distancing measures

UKQuestionnaire dataand compartmental

epidemic modelOrea andAacutelvarez202043

Lockdown SpainSpatial econometric

analysis

Lorch et al202044

Mobility restrictionstesting amp tracing

social distancing andbusiness restrictions

Tuumlbingen (Germany)Authorsrsquo own

spatiotemporal modelof epidemics

Gatto et al202045

Various restrictions tomobility and

human-to-humaninteractions

Italy

SusceptiblendashExposedndashInfectedndashRecovered(SEIR)-like diseasetransmission model

Quilty et al202046

Intercity travelrestrictions

Beijing ChongqingHangzhou and Shenzhen

(Mainland China)

Branching processtransmission model

52

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 53: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix G Handling edge cases in the data collection

In our data collection process we relied on carefully worded definitions of 9 different NPIs(Table F4) which allowed us to systematically determine the date on which a countryimposed an NPI and if applicable the date the NPI was lifted

In some cases however we faced ambiguities in how to interpret the start date of an NPIOne kind of challenge arose when descriptions of policy measures were less specific thanour NPI definitions (eg a ban on ldquolarge gatheringsrdquo that does not specify the exact numberof people that constitutes a ldquolarge gatheringrdquo) Another difficulty was due to NPI policiesthat made distinctions that we did not make in our own NPI definitions (eg an NPI policythat made a distinction between the number of people able to gather indoors vs outdoors)

To resolve these ambiguities in a consistent manner our researchers developed a set of prin-ciples and guidelines that were followed during the data collection process For each of theexamples below the relevant sources are available in the data table in the supplementarymaterial

Situation Sometimes only public gatherings are banned with no explicit ban on pri-vate gatherings

How we deal with it We still counted this as a ban on gatherings

Examples

bull Sweden In Sweden they banned all public gatherings of more than 50 people (demon-strations religious meetings theater performances markets and other events thatrelied on the constitutional freedom of assembly) however the ban did not have amandate to prohibit private gatherings (such as private parties) We counted this as aban on gatherings

bull Finland In Finland they banned all public gatherings of more than 10 people onthe 16th of March Although formal restrictions did not apply to private gatheringsthis policy met our definition of a ban on gatherings (Note that this inclusion seemsparticularly valid in light of the fact that according to Finnish police the formalrestrictions on public events were widely interpreted to apply to private gatherings aswell and there were very few reports of large private parties despite the absence offormal restrictions)

Situation The size limits on gatherings sometimes differ between indoor and outdoorgatherings

How we deal with it In these cases we relied on the limitations on indoor events as theseevents entail a greater risk of transmission

Example

53

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 54: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

bull Spain In Spain a range of rules were employed as the country gradually eased re-strictions on gatherings In phase 1 cultural events were permitted with up to 30people indoors and up to 200 outdoors This was counted as ldquoGatherings limited to100 people or lessrdquo

Situation The size limit on gatherings sometimes differs between different types ofgatherings

How we deal with it In this case researchers would use their best judgment to infer whetherthe restriction would apply to most gatherings of a given size

Example

bull Spain In Spain phase 1 of the reopening allowed for cultural events to have up to30 participants indoors while social gatherings were limited to 10 people In thiscase since ldquocultural eventsrdquo is broad we counted this as a case of ldquogatherings limitedto 100 people or lessrdquo However if for example all gatherings above 5 people hadbeen banned with an exception for funerals we would have counted this as ldquogather-ings limited to 10 people or lessrdquo since the exemption only applied to a minority ofgatherings

Situation Limitations on gathering sizes are not clearly given yet a policy stating thatldquolarge events are bannedrdquo is in place

How we deal with it Our researchers used the relevant context to infer the most likely scopeof the policy

Example

bull Albania on March 8 ldquoauthorities had also ordered cancellations of all large publicgatherings including cultural events and were asking sporting federations to cancelscheduled matchesrdquo The events that are mentioned here are multi-thousand persongatherings and so we took March 8th to be the start date of ldquoGatherings limited to1000 people or lessrdquo However it was unclear whether gatherings of 100-1000 wouldalso have been banned so we did not yet say that ldquoGatherings limited to 100 peopleor lessrdquo was instantiated

Situation Only some schools were closed or schools reopened gradually

How we deal with it Since our definition of the NPI is that ldquoMost schools are closedrdquo we didnot count the closure of just a few schools or school years sufficient to meet this criteriaSimilarly if schools reopened for only a very limited number of year groups for examplefor final year students sitting exams we did not count this as a lifting of the ldquomost schoolsclosedrdquo NPI

54

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 55: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Examples

bull Sweden Sweden kept all schools through 9th grade open but closed high schools(gt16 year olds) In this case we did not count this as ldquoMost schools closedrdquo sincemore than 75 of students are below 9th grade

bull Czech Republic After closing all schools on the March 13 the Czech Republic allowedschools to reopen for teaching in some contexts from May 11 (specifically for studentsin their final year of primary school or high school preparing for exams) Howeverwe still counted this as ldquoMost schools closedrdquo since the majority of students were notin school We recorded the end date for school closure to be June 8 when all schoolsreopened

Situation In a country where most non-essential businesses were closed the liftingof business closures is gradual and businesses in different sectors are successivelyallowed to open

How we deal with it Countries reopen sectors in different idiosyncratic ways and succes-sions Given the available data it is not feasible to create a principle that can be appliedunambiguously to every single case without some involvement of researcher judgment Thegeneral guideline we used was If only a few low-risk businesses (eg bike stores hard-ware stores etc) are additionally allowed to reopen then we still counted this as ldquoMostnonessential businesses closedrdquo However if any one of the following criteria are met thenwe counted ldquoMost nonessential businesses closedrdquo as having lifted but the ldquoSome businessesclosedrdquo NPI was still in place

bull All regular retail stores with only a few exceptions eg size limitations are openbull Contact-based services such as hairdressers and tattoo parlors are openbull Restaurants and bars are open and serving indoors

We decided that meeting any one of these criteria is a sufficient condition for taking a coun-try from ldquoMost nonessential businesses closedrdquo to ldquoSome businesses closedrdquo This heuristicwas partly based on the fact that the status of these categories appeared to be consistentlycorrelated meaning that even in the absence of complete specifications as to what hadreopened or not it was typically possible to infer the overall level of reopening based oneither of these categories Meeting at least one of these criteria was considered a necessarycondition for ending the ldquoMost nonessential businesses closedrdquo NPI

Examples

bull Slovakia On April 22 retail operations and services up to 300 m2 opened Since thismeets one of the sufficient conditions we counted April 22 as the end date for ldquoMostnonessential businesses closedrdquo

bull Ireland On May 18 the following reopened hardware stores builders merchantsand those providing essential supplies retailers involved in the sale and repair ofvehicles certain office supply stores Because this white list does not meet any of the

55

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 56: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

three criteria Irelandrsquos end date for ldquoMost nonessential businesses closedrdquo was notcounted as May 18

bull Czech Republic On April 20 several businesses reopened including farmerrsquos mar-kets marketplaces locksmiths bike shops car dealers electronics stores At thispoint none of the criteria were met so we recorded the Czech Republic as still hav-ing ldquoMost nonessential businesses closedrdquo On May 11 a long list of businesses re-opened including barbers hairdressers museums all establishments in sufficientlylarge shopping centers shows with up to 100 participants and restaurants with awindow facing the street Since contact-based services (hairdressers) and all retailestablishments in sufficiently large spaces were allowed to reopen we counted May11 as the end date for the ldquoMost nonessential businesses closedrdquo NPI

bull Croatia On April 27 all trade activities (except within shopping malls) service jobsthat donrsquot involve physical contacts museums libraries galleries opened Since thecriteria regarding lsquoall retail stores being openrsquo was met we counted April 27 as theend date for ldquoMost nonessential businesses closedrdquo

56

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 57: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

Appendix H References

References

1 Seth Flaxman et al ldquoEstimating the effects of non-pharmaceutical interventions onCOVID-19 in Europerdquo In Nature (2020) pp 1ndash8

2 Suryakant Yadav and Pawan Kumar Yadav ldquoBasic Reproduction Rate and Case FatalityRate of COVID-19 Application of Meta-analysisrdquo In COVID-19 SARS-CoV-2 preprintsfrom medRxiv and bioRxiv (May 2020) DOI 1011012020051320100750 URLhttpswwwmedrxivorgcontent1011012020051320100750v1

3 J Wallinga and M Lipsitch ldquoHow generation intervals shape the relationship betweengrowth rates and reproductive numbersrdquo In Proceedings of the Royal Society B Biolog-ical Sciences 2741609 (Nov 2006) pp 599ndash604 DOI 101098rspb20063754

4 D Cereda et al ldquoThe early phase of the COVID-19 outbreak in Lombardy Italyrdquo In(Mar 20 2020) arXiv 200309320v1 [q-bioPE] URL httpsarxivorgabs200309320

5 John M Griffin et al ldquoA rapid review of available evidence on the serial interval andgeneration time of COVID-19rdquo In medRxiv (2020)

6 Tapiwa Ganyani et al ldquoEstimating the generation interval for coronavirus disease (COVID-19) based on symptom onset data March 2020rdquo In Eurosurveillance 2517 (2020)p 2000257

7 Luca Ferretti et al ldquoQuantifying SARS-CoV-2 transmission suggests epidemic controlwith digital contact tracingrdquo In Science 3686491 (2020)

8 Christophe Fraser ldquoEstimating individual and household reproduction numbers in anemerging epidemicrdquo In PloS one 28 (2007)

9 Mrinank Sharma et al ldquoOn the Robustness of Effectiveness Estimation of Nonpharma-ceutical Interventions Against COVID-19 Transmissionrdquo In Arxiv (2020)

10 Andrew Gelman Jessica Hwang and Aki Vehtari ldquoUnderstanding predictive informa-tion criteria for Bayesian modelsrdquo In Statistics and computing 246 (2014) pp 997ndash1016

11 Natalie M Linton et al ldquoIncubation period and other epidemiological characteristics of2019 novel coronavirus infections with right truncation a statistical analysis of publiclyavailable case datardquo In Journal of clinical medicine 92 (2020) p 538

12 Qun Li et al ldquoEarly Transmission Dynamics in Wuhan China of Novel CoronavirusndashInfected Pneumoniardquo In New England Journal of Medicine 38213 (Mar 2020) pp 1199ndash1207 DOI 101056nejmoa2001316

13 Qifang Bi et al ldquoEpidemiology and Transmission of COVID-19 in Shenzhen ChinaAnalysis of 391 cases and 1286 of their close contactsrdquo In COVID-19 SARS-CoV-2preprints from medRxiv and bioRxiv (Mar 2020) DOI 1011012020030320028423URL httpswwwmedrxivorgcontent1011012020030320028423v3

57

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 58: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

14 Robert Verity et al ldquoEstimates of the severity of coronavirus disease 2019 a model-based analysisrdquo In The Lancet Infectious Diseases (Mar 2020) DOI 101016s1473-3099(20)30243-7

15 Simon Duane et al ldquoHybrid Monte Carlordquo In Physics Letters B 1952 (Sept 1987)pp 216ndash222 DOI 1010160370-2693(87)91197-x

16 Matthew D Hoffman and Andrew Gelman ldquoThe No-U-Turn Sampler Adaptively Set-ting Path Lengths in Hamiltonian Monte Carlordquo In Journal of Machine Learning Re-search 1547 (2014) pp 1593ndash1623 URL httpjmlrorgpapersv15hoffman14ahtml

17 Ying Liu et al ldquoThe reproductive number of COVID-19 is higher compared to SARScoronavirusrdquo In Journal of travel medicine (2020)

18 John Salvatier Thomas V Wiecki and Christopher Fonnesbeck ldquoProbabilistic program-ming in Python using PyMC3rdquo In PeerJ Computer Science 2 (2016) e55

19 A Gelman et al ldquoBayesian Data Analysis Second Editionrdquo In Chapman amp HallCRCTexts in Statistical Science Taylor amp Francis 2003 Chap Model checking and im-provement ISBN 9781420057294 URL httpsbooksgooglecommxbooksid=TNYhnkXQSjAC

20 Carsten F Dormann et al ldquoCollinearity a review of methods to deal with it and asimulation study evaluating their performancerdquo In Ecography 361 (2013) pp 27ndash46

21 Anne Cori et al ldquoA new framework and software to estimate time-varying reproduc-tion numbers during epidemicsrdquo In American journal of epidemiology 1789 (2013)pp 1505ndash1512

22 Pierre Nouvellet et al ldquoA simple approach to measure transmissibility and forecastincidencerdquo In Epidemics 22 (2018) pp 29ndash35

23 Simon Cauchemez et al ldquoEstimating the impact of school closure on influenza trans-mission from Sentinel datardquo In Nature 4527188 (2008) pp 750ndash754

24 P R Rosenbaum and D B Rubin ldquoAssessing Sensitivity to an Unobserved Binary Co-variate in an Observational Study with Binary Outcomerdquo In Journal of the Royal Sta-tistical Society Series B (Methodological) 452 (1983) pp 212ndash218 DOI 101111j2517-61611983tb01242x eprint httpsrssonlinelibrarywileycomdoipdf101111j2517-61611983tb01242x URL httpsrssonlinelibrarywileycomdoiabs101111j2517-61611983tb01242x

25 James M Robins Andrea Rotnitzky and Daniel O Scharfstein ldquoSensitivity analysis forselection bias and unmeasured confounding in missing data and causal inference mod-elsrdquo In Statistical models in epidemiology the environment and clinical trials Springer2000 pp 1ndash94

26 A Gelman and J Hill ldquoCausal inference using regression on the treatment variablerdquo InData Analysis Using Regression and MultilevelHierarchical Models (2007)

27 Thomas Hale et al Oxford COVID-19 Government Response Tracker Blavatnik Schoolof Government https www bsg ox ac uk research research - projects coronavirus-government-response-tracker 2020

58

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 59: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

28 Eva S Fonfria et al ldquoEssential epidemiological parameters of COVID-19 for clinicaland mathematical modeling purposes a rapid review and meta-analysisrdquo In medRxiv(2020)

29 Andrew Gelman and Cosma Rohilla Shalizi ldquoPhilosophy and the practice of Bayesianstatisticsrdquo In British Journal of Mathematical and Statistical Psychology 661 (Feb2012) pp 8ndash38 DOI 101111j2044-8317201102037x

30 Carl Edward Rasmussen ldquoGaussian processes in machine learningrdquo In Summer Schoolon Machine Learning Springer 2003 pp 63ndash71

31 Xiaohui Chen and Ziyi Qiu ldquoScenario analysis of non-pharmaceutical interventions onglobal COVID-19 transmissionsrdquo httpsarxivorgabs200404529 Apr 7 2020

32 Nicolas Banholzer et al ldquoImpact of non-pharmaceutical interventions on documentedcases of COVID-19rdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (Apr2020) DOI 1011012020041620062141 URL httpswwwmedrxivorgcontent1011012020041620062141v3

33 Solomon Hsiang et al ldquoThe Effect of Large-Scale Anti-Contagion Policies on the Coro-navirus (COVID-19) Pandemicrdquo In medRxiv (May 2020) p 2020032220040642DOI 1011012020032220040642

34 Jacques Naude et al ldquoWorldwide Effectiveness of Various Non-Pharmaceutical Inter-vention Control Strategies on the Global COVID-19 Pandemic A Linearised ControlModelrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv and bioRxiv (May 2020)DOI 1011012020043020085316 URL httpswwwmedrxivorgcontentearly202005122020043020085316

35 Jonas Dehning et al ldquoInferring change points in the spread of COVID-19 reveals theeffectiveness of interventionsrdquo In Science (2020)

36 Mark J Siedner et al ldquoSocial distancing to slow the US COVID-19 epidemic an in-terrupted time-series analysisrdquo In COVID-19 SARS-CoV-2 preprints from medRxiv andbioRxiv (Apr 2020) DOI 1011012020040320052373 URL httpswwwmedrxivorgcontent1011012020040320052373v2

37 Moritz U G Kraemer et al ldquoThe effect of human mobility and control measures on theCOVID-19 epidemic in Chinardquo In Science 3686490 (Mar 2020) pp 493ndash497 DOI101126scienceabb4218

38 Adam J Kucharski et al ldquoEarly dynamics of transmission and control of COVID-19 amathematical modelling studyrdquo In The Lancet Infectious Diseases 205 (May 2020)pp 553ndash558 DOI 101016s1473-3099(20)30144-4

39 Raj Dandekar and George Barbastathis ldquoNeural Network aided quarantine controlmodel estimation of global Covid-19 spreadrdquo In (Apr 2 2020) arXiv 200402752v1[q-bioPE] URL httpsarxivorgabs200402752

40 Benjamin F Maier and Dirk Brockmann ldquoEffective containment explains subexponen-tial growth in recent confirmed COVID-19 cases in Chinardquo In Science 3686492 (Apr2020) pp 742ndash746 DOI 101126scienceabb4557

59

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References
Page 60: The effectiveness of eight nonpharmaceutical …...2020/05/28  · Besiroglu10, Hong Ge PhD11, Vladimir Mikulik5, Meghan A. Hartwick PhD12, Prof Yee Whye Teh PhD13, Prof Leonid Chindelevitch

41 Sofia B Villas-Boas et al Are We StayingHome to Flatten the Curve Tech rep UCBerkeley Department of Agricultural and Resource Economics 2020

42 Christopher I Jarvis et al ldquoQuantifying the impact of physical distance measures onthe transmission of COVID-19 in the UKrdquo In BMC Medicine 181 (May 2020) DOI101186s12916-020-01597-8

43 Luis Orea and Inmaculada Aacutelvarez How effective has been the Spanish lockdown to battleCOVID-19 A spatial analysis of the coronavirus propagation across provinces WorkingPapers 2020-03 FEDEA 2020 URL httpdocumentosfedeanetpubsdt2020dt2020-03pdf

44 Lars Lorch et al ldquoA Spatiotemporal Epidemic Model to Quantify the Effects of ContactTracing Testing and Containmentrdquo In (Apr 15 2020) arXiv 200407641v2 [csLG]URL httpsarxivorgabs200407641

45 Marino Gatto et al ldquoSpread and dynamics of the COVID-19 epidemic in Italy Effects ofemergency containment measuresrdquo In Proceedings of the National Academy of Sciences11719 (Apr 2020) pp 10484ndash10491 DOI 101073pnas2004978117

46 Billy J Quilty et al ldquoThe effect of inter-city travel restrictions on geographical spreadof COVID-19 Evidence from Wuhan Chinardquo In COVID-19 SARS-CoV-2 preprints frommedRxiv and bioRxiv (2020) DOI 1011012020041620067504 eprint httpswwwmedrxivorgcontentearly202004212020041620067504fullpdf URL httpswwwmedrxivorgcontentearly202004212020041620067504

60

CC-BY-NC 40 International licenseIt is made available under a is the authorfunder who has granted medRxiv a license to display the preprint in perpetuity (which was not certified by peer review)

The copyright holder for this preprint this version posted July 23 2020 httpsdoiorg1011012020052820116129doi medRxiv preprint

  • Appendix
    • Modelling details
      • Detailed model description
      • Technical Model Description
        • Validation
          • Unseen data
          • Sensitivity Analysis
          • Robustness to unobserved effects
            • Additional sensitivity analyses and validation
              • Role of NPI timing
              • Sensitivity to data preprocessing
              • Sensitivity to additional epidemiological assumptions
              • Additional country exclusions
              • Validation using predictions in excluded countries
              • MCMC stability results
              • Posterior predictive distributions
                • Additional results
                  • Estimated R0 by country
                  • The individual effects of school and university closures
                  • Collinearity
                  • Correlations between effectiveness estimates
                    • Additional discussion of assumptions and limitations
                      • Limitations of the data
                      • Model limitations
                        • Overview of previous work
                        • Handling edge cases in the data collection
                        • References