Network Dynamics and Simulation Science Laboratory
Interaction Based Computer Modeling forComprehensive Incident Characterization to
Support Pandemic Preparedness
Madhav V. MaratheNetwork Dynamics and Simulation Science Laboratory
Virginia Bio-Informatics Institute & Dept. of Computer ScienceVirginia Tech
NDSSL-TR-06-019Web Site: http://ndssl.vbi.vt.edu
Network Dynamics and Simulation Science Laboratory
These slides are a version of the invited lecture given on April 22nd asa part of Workshop on Spatial Data Mining held as a part of theAnnual SIAM Data Mining Conference, April 2006.
Organizers: Professors Naren Ramakrishnan, Virginia Tech & ChrisBailey-Kellogg, Dartmouth University
Venue & Date: Washington DC April 22nd 2006.
Network Dynamics and Simulation Science Laboratory
Acknowledgements
Virginia Tech: Members, Network Dynamics & Simulation Science Laboratory, VBI Karla Atkins, Christopher L. Barrett, Richard Beckman, Keith Bisset, JiangZhou Chen,
Stephen Eubank, V.S. Anil Kumar, Achla Marathe, Henning Mortveit, Paula Stretz,External Collaborators:
Aravind Srinivasan (U. Maryland), Nan Wang (Morgan Stanley), Zoltan Toroczkai(NotreDame), Doug Roberts (RTI)
Work funded in part by National Institute of General Medical Sciences, National Institutes of Health, “Pilot Projectsfor Models of Infectious Disease Agent Study (MIDAS)” contract U01-GM070694.
Network Dynamics and Simulation Science Laboratory
Avian Influenza
http://www.zephyr.dti.ne.jp/~john8tam/main/Library/influenza_site/influenza_virus.jpg
•Avian influenza -- infection caused by avian(bird) influenza (flu) viruses
•Spreads thru saliva, nasal secretions, and feces,
•Categorized based on surface proteinshemagglutinin [HA] and neuraminidase [NA]
•Person to person has been limited and has notcontinued beyond one person.
•Antigenic “shift” vs “drift”
–Little historical information on evolutionof virus in animals
–In circulation: H1N1 (1918) & H3N2(1968) (has displaced H2N2 (1957))
Network Dynamics and Simulation Science Laboratory
Intervention Strategies for Human Influenza
• Vaccine
– Vaccine based on circulating avian flu [oops, shift]
– Vaccine based on pandemic strain [takes 4-6 months]
• Antivirals
– Amantadines(resistance develops rapidly )– Neuraminidase inhibitors: oseltamivir (TamiFlu) and zanamivir (raw
material?)
– Prophylaxis (only while doses received)
• Social distancing
– Isolation and quarantine of cases– Reduced contacts for others
Network Dynamics and Simulation Science Laboratory
Classical ODE Model: Coupled Rate Equation withComplete Mixing
S
R I
• Partition the population into compartments– Susceptible– Infected– Removed
• Assume mass action(uniform mixing) β
γKermack, W. O. and McKendrick, A. G. Proc. Roy. Soc. Lond. A 115, 700-721, 1927. Also see May, Hethcote, Heesterbeek, Murray,….
Network Dynamics and Simulation Science Laboratory
Are ODE Models Up to the Task ?
• ODEs lack agency and heterogeneity of contact structure– True complexity stems from interactions among many discrete actors– Each kind of interaction must be explicitly modeled– Refinement is difficult
• ODE models say very little about transients
• Use constants such as R0 as if they are universal constants– E.g. what is the R0 for smallpox?
• Human behavioral issues– Inhomogeneous compliance
– Changes in the face of crisis
• Are not naturally suited for formulating and testing policies that can beimplemented– E.g. reduce contact rates by 30% --- is an ambiguous statement
Network Dynamics and Simulation Science Laboratory
An Alternative Approach -- Network BasedRepresentation
•Initially node 1 is infected. Each node runs SIR model
•At each step an infected node infects its uninfectedneighbors with probability p and this is doneindependently
•A node is infected if one or more of its neighborsuccessfully infects it
•Node remains infected for 2 steps and then recovers
t=0 t=1
Network Dynamics and Simulation Science Laboratory
Models Shape Thinking
Compartmental models suggest“Reduce contact rates at work by 50%”
Three different ways to reduce rates resulting in very different networks
Effect of decreasing fidelity in social networks (above)on disease dynamics (below)
Network Dynamics and Simulation Science Laboratory
Taking this one step further -- A Bottom up HighlyResolved Simulation
• Create a proto-population(s)– Humans, animal reservoirs, …
• Refine it by adding necessary attributes– Disease model
– Demographics
– Behavioral response to outbreak
– Critical workers
– Interactions with other infrastructures
• Simulate to develop correlations among attributes• Test intervention scenarios
Network Dynamics and Simulation Science Laboratory
Simfrastructure: Modeling InterdependentInfrastructures
• A collection of interoperable simulationsof societal infrastructures
• each mimics the time-dependentinteractions of every individual in aregional area with the built environment(critical infrastructures)
• based on:– where they are,– why they are there,
– what they are doing,– how they got there, and– who they went with
Network Dynamics and Simulation Science Laboratory
Phase 1: Creating a dynamic socialcontact networks
Creating a synthetic microscopic urban environment– Locating every individual in a city every second at the scale of block
group
– Individuals have statistically correct demographic characteristics
– Built infrastructure synthesized to provide the context
Network Dynamics and Simulation Science Laboratory
Data Fusion from Multiple Sources
Network Dynamics and Simulation Science Laboratory
Example Synthetic Household(ca.1990)
Age 26 26 7
Income $27k $16k $0
Status worker worker student
Automobile
Network Dynamics and Simulation Science Laboratory
Example Located SyntheticPopulation
Example Route Plans
HOME
WORK
LUNCH
WORK
DOCTOR
SHOP
HOME
HOME
WORK
SHOP
second person in household
first person in household
Network Dynamics and Simulation Science Laboratory
Routing and Microsimulation
Route Density by timeRoute Density by time
Microsimulation Microsimulation of vehiclesof vehicles
Network Dynamics and Simulation Science Laboratory
Synthetic Social Contact Networks
Edge attributes:Edge attributes:•• activity type: shop, work, school activity type: shop, work, school•• (start time 1, end time 1) (start time 1, end time 1)•• (start time 2, end time 2) (start time 2, end time 2)•• ……
Vertex attributes:Vertex attributes:•• (x,y,z) (x,y,z)•• land use land use•• ……
Locations (1 million)Locations (1 million)
Vertex attributes:Vertex attributes:•• age age•• household size household size•• gender gender•• income income•• ……
People (8 million)People (8 million)
Network Dynamics and Simulation Science Laboratory
Various Projections of People-LocationTemporal Social Visitation Network
11.00-12.0011.00-12.00 11.0011.00
GGLL
10.0010.009.009.00
GGSSLL
GGSSPP
08.00-10.0008.00-10.00
11.00-12.0011.00-12.00
09.00-10.0009.00-10.00
GGPLPL
08.00-12.0008.00-12.00
08.00-12.0008.00-12.00
10.00-12.0010.00-12.00
10.00-11.0010.00-11.00
08.00-09.0008.00-09.00
GGPP
08.00-9.0008.00-9.0008.00-10.0008.00-10.00
10.00-12.0010.00-12.00
TemporalProjections
StaticProjections Location
ProjectionsPeopleProjections
Network Dynamics and Simulation Science Laboratory
Phase 2: Modeling Disease Dynamics
Network Dynamics and Simulation Science Laboratory
Within and Among Host Disease Model: CoupledProbabilistic Timed Transition System
• On each time step, each person’s state of health can change– susceptible -> infectious in t time steps with probability depending
on health and duration of contacts in a social network– infectious -> removed or recovered and thus susceptible after t time step (dependency is
stochastic and on demographics)
– Removed -> terminal state (could become Susceptible again)
S
R I
Location: 2089383 present2 infected1 infectious
Network Dynamics and Simulation Science Laboratory
Effect of Network Structure on Performance*
* Nature May 2004
Network Dynamics and Simulation Science Laboratory
Pandemic Preparedness: Issues
Network Dynamics and Simulation Science Laboratory
Functional Requirements for a Useful IncidenceManagement System
• Specifying a Situation– Multiple complex systems interact via effects on subsytems
– How to represent critical workers? cascading failures? specific incidents?
– Assist reasoning about the situation
• Specifying an Intervention– What does it mean to “close schools” in a pandemic?
– Inhomogeneous society => models adapted to local circumstances
• Structural validity versus Predictability– Stopped watch vs slow watch
– Deduce sign, relative magnitude, and interaction of controls’ effects
Network Dynamics and Simulation Science Laboratory
Considerations in designing effectivecontainment methods
• Choosing the Battlefield– Treat non-human populations– Interrupt cross-species transmission or re-assortment
• May require treating most of the population (backyard poultry) + swine
– Drive new human variants extinct before they are well adapted– Control isolated outbreaks as they occur– Mitigate a full-blown pandemic (30-40% attack rate ~simultaneously)
• Cost Functions– Human suffering averted (referenced to baseline?)– Number of doses of drug required and maximum application rate– Time gained (delay of exponential growth)
– Robustness to mis-specification and resource limitation– Economic impact: immediate and long term
Network Dynamics and Simulation Science Laboratory
Representing & Implementing anIntervention
• Closing schools– When?– What happens to the people at school?– Who else is affected?
• those most likely to spread disease may stay home with sick child.• correlation is inherent in the model, not specified by the modeler
• Evacuating an area– How will it be staged?– What means of transportation?– With whom will people coordinate movements?– Where will congestion appear?
Network Dynamics and Simulation Science Laboratory
Data Mining Challenges to Support PandemicPreparedness
Network Dynamics and Simulation Science Laboratory
Classical Questions in New Settings
• Data Sets– Social Contact Networks are new,
large and evolving– Large number of attributes– Data is being refined continually– New kinds of data sets being
integrated: e.g. data from monitoringstations
• Application Specific Questions– Classification: Classifying who is
vulnerable and critical– Aggregation: Finding regions with high
level of infectivity
– Correlations: Households with childrengoing to a stadium feel sick
• Classical Questions– Classification
– Finding “Nuggets”, correlations– Simple hierarchical aggregation– Spatial Queries
Network Dynamics and Simulation Science Laboratory
Degree distributionof the (temporal)people-ppl. graph
Basic Structural Properties of Portland Social Network
Degree distributions of the bipartite graph
Clustering Coefficient
Network Dynamics and Simulation Science Laboratory
Spatial & Temporal Degree Distributionof People and Activities
Network Dynamics and Simulation Science Laboratory
Example of Cultural Measurements: DegreeDistribution in Labeled Networks
New, fundamental characterization of a realistic social contact network• Partition the population based on some properties (age, income, etc.)• For each part, determine fraction of neighbors in every other part• The measure can only be produced by Simfrastructure-like simulations
Partitionbased on age(Portland)
Network Dynamics and Simulation Science Laboratory
Temporal Degrees by Activity Location Types*
* DIMACS Issue on Epidemiology
Network Dynamics and Simulation Science Laboratory
Structural Robustness of Complex Networks -- Naïve ways toprevent disease-spread
Shattering the socialnetwork by:
Vaccinating (quarantining)high-degree people
Closing downhigh-degree locations
Router Level Networkserving Portland
Network Dynamics and Simulation Science Laboratory
Alternative - Placing Sensors*
• Choose locations to place sensors w/ min. cost• Equivalent to min. dominating set problem
– Minimum locations covering (almost) all people
• Hard to approximate within ratio Θ(log n)
{ {B, C} is a minimum dominating set} is
1234567
B
A
C
People
Locations
ACM SIAM SODA 2004
Network Dynamics and Simulation Science Laboratory
Solution: Early detection and targetedvaccination/quarantining -- Efficient algorithms for Sensor
Placement*
• High degree buildings– do not share too many individuals
• Most people go to atleast one high-degreelocation– Measured by uniformly sampling people
– Sampled people “lead” us to high-degreelocations.
• Not too many High-degree locations
Implication: Simple provable methodfor sensor placementPlace Sensors at high degree nodes < 15 sec.< 15 sec.< 15 sec.Fast-1
> 7 hr.> 5 hr.> 4 hr.Traditional
55.78%54.47%41.23%Traditional
59.60%58.37%47.35%Fast-1
Our RGCLReal
* Eubank et al. Nature 04, SODA 04
Network Dynamics and Simulation Science Laboratory
• Vulnerability– Probability of becoming infected
– Useful measure for weighting vertex statistics
– Depends on where disease starts and spreading dynamics
• Criticality– Expected # infections stopped by removing vertex/edges
• Involve time, initial conditions, topology & dynamics– at time 1, both reduce to function of degree and transmission probabilities
– Easy to compute
• Easily extended to sets of vertices / edges
• General problem is computationally hard, need approximations
• Interventions reduce vulnerability, critical edge/vertices are potential points ofinterventions
An Illustrative Example: Vulnerability versusCriticality
Node 1 is most vulnerable and 7most critical
Network Dynamics and Simulation Science Laboratory
Some Open Questions in Spatial Databasesand Data mining
• Fast Computations of Network Properties,
– E.g. Provable approximation algorithms for computing expansion, betweeness,
community structure
– Properties of time varying graphs, e.g. stochastic mining
• Methods for Storage and Regeneration
– E.g. Can we store the dendogram efficiently so that certain macroscopic
parameters can be computed efficiently
– Use model driven methods for predicting how disease will spread
Network Dynamics and Simulation Science Laboratory
Challenges: A Biased Perspective
Network Dynamics and Simulation Science Laboratory
Data Mining and Simulations
• Simulation based systems have two interwoven steps
– synthesizing and integrating known data (akin to join operations)
– Creating new relations and data as a result of interactions between individualsand infrastructures (achieved via simulations)
• E.g. Tom got infected since he visited a location L and came in spatial proximity ofJack who was sick
• Simulations are a procedural form of representation of large data sets (as opposedto data bases that are extensive form)
• Monitor-Simulate-Mine-Loop
– Mine simulation generated data to actively guide simulations
– Data Mining is thus a part of the simulation based model
Network Dynamics and Simulation Science Laboratory
Example and Illustrative questions
• Social Distancing by reducing contacts– Use Data Mining methods to decide which contacts to reduce
• E.g. can I reduce 30% of everyone’s contacts uniformly, or reduce the overall contactrate by 30% by quarantining some individuals
– Run Simulations to see the outcome– Based on simulations, execute the policy– Observe the outcome and modify the policy
• Incremental “Nuggets”– Instead of finding the most critical set of individuals, we find a partial set and then
incrementally extend the solution
– Variant of Online/Streaming model where information is revealed incrementallyand actions taken modify the future
Co-evolution of Epidemic Policy, Simulation and Mining
Network Dynamics and Simulation Science Laboratory
Public Health Informatics System to SupportPandemic Preparedness
Network Dynamics and Simulation Science Laboratory
Where are we heading
Network Dynamics and Simulation Science Laboratory
Next Generation Simfrastructure: Web and GridEnabled
Network Dynamics and Simulation Science Laboratory
U.S. Proto-Population
• Who: People– 256 million individuals with composable
attributes– Household structure and some
demographics from U.S. Census
• What: Activities– Activity sequences based on data from
national travel survey– Includes travel between cities/countries
• Where: Locations– Disaggregate locations and infrastructure
information– Locations assigned to activity sequences
• Why: Examples– Spread of pandemic disease– Dynamic demand on infrastructures– Situation assessment for emergency response
Network Dynamics and Simulation Science Laboratory
Conclusions and Summary
• Described an Interaction based modeling tool to support pandemic
preparedness and response (Simfrastructure)
– Resolved at the level of individuals and locations
– Individualized Model, allows to represent heterogeneous populations
– Realistic social contact network captures heterogeneous social interaction
– Models for within host and between host transmission
– Aids development and analysis of policies that can be implemented
• Suggests new directions for future research in data mining
– Fast algorithms to analyze and store massive data sets
– Data Mining is a part of the model
Network Dynamics and Simulation Science Laboratory
Related References
1. C. Barrett, S. Eubank, V. Anil Kumar, M. Marathe. Understanding Large Scale Social and InfrastructureNetworks: A Simulation Based Approach, in SIAM news, March 2004. Appears as part of MathAwareness Month on The Mathematics of Networks.
2. C. Barrett, J.P. Smith and S. Eubank. Modern Epidemiology Modeling, in Scientific American, March 2005.
3. C. L. Barrett, R. J. Beckman, K. P. Berkbigler, K. R. Bisset, B. W. Bush, K. Campbell, S. Eubank, K. M.Henson, J. M. Hurford, D. A. Kubicek, M. V. Marathe, P. R. Romero, J. P. Smith, L. L. Smith, P. L.Speckman, P. E. Stretz, G. L. Thayer, E. V. Eeckhout, and M. D. Williams. TRANSIMS: TransportationAnalysis Simulation System. Technical Report LA-UR-00-1725, Los Alamos National LaboratoryUnclassified Report, 2001.
4. S. Eubank, H. Guclu, V.S. Anil Kumar, M. Marathe, A. Srinivasan, Z. Toroczkai and N. Wang, ModelingDisease Outbreaks in Realistic Urban Social Networks, Nature, 429, pp. 180-184, May 2004.
5. S. Eubank, V.S. Anil Kumar, M. Marathe, A. Srinivasan and N. Wang, Structural and Algorithmic Aspectsof Large Social Networks, Proc. 15th ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 711-720,2004.
6. S. Eubank, V.S. Anil Kumar, M. Marathe, A. Srinivasan and N. Wang, Structure of Social ContactNetworks and their Impact on Epidemics. to appear in AMS-DIMACS Special Volume on Epidemiology,2006.
Network Dynamics and Simulation Science Laboratory
Additional References
1. I. Longini, A. Nizam, S. Xu, K. Ungchusak, W. Hanshaoworakal, D. Cummings, E. Hollaran, Science,2005, pp. 1083-1807.
2. I. Longini, E. Hollaran, A. Nizam and Y. Yang, American J. Epidemiology, 2004, pp. 623-633.
3. T. German, K. Kadau, I. Longini, C. Macken, Mitigation Strategies for Pandemic Influenza in theUnited States, PNAS, 2006, 105(15), pp. 5935-5940.
4. N Ferguson, D. Cumming, S. Cauchemez, C. Fraser, S. Riley, A. Meeyai, S. Lamsrrithaworn, D.Burke, Nature, 2005, pp. 209-214.
5. R. Anderson and R. May, Infectious Disease of Humans, Oxford University Press, UK, 1991.6. E.H. Kaplan, D.L. Craft, and L.M. Wein. Proc. Natl. Acad. Sci. USA, 99:10935–10940, 2002.7. L. Meyers, M. E. J. Newman, M. Martin, and S. Schrag. Emerging Infectious Diseases, pages
204–210, 2003.8. M. E. J. Newman. SIAM Review 45, 167–256 (2003).9. R. Albert and A. Barabasi, Rev. Mod. Phys. 74, 47-97 (2002).