49
Dynamic Decision Problems: Hybrid Use of Decision Trees & System Dynamics Models Nathaniel Osgood (with some slides courtesy of Karen Yee) CMPT 858 March 24, 2011 1

Dynamic Decision Problems: Hybrid Use of Decision … · Dynamic Decision Problems: Hybrid Use of Decision Trees & System Dynamics Models ... – Strategy selection

  • Upload
    haque

  • View
    236

  • Download
    0

Embed Size (px)

Citation preview

Dynamic Decision Problems:Hybrid Use of Decision Trees &

System Dynamics Models

Nathaniel Osgood

(with some slides courtesy of Karen Yee)

CMPT 858

March 24, 2011 1

Local Context

• Saskatchewan suffered the highest

incidence of WNV in Canada in 2003 and

2007

• Saskatoon Health Region (SHR) reported

6.5% and 25% of the provincial cases in

2003 and 2007, respectively

2

Mosquito species Culex tarsalis primarily responsible for spreadingWNV in Saskatchewan

3Source: Penn State University

MosquitoLifecycle

Source:www.azstarnet.com/metro/295104

Source: unknown

Adult

Source:www.comosquitocontrol.com/Mosquito_Biology.html

Pupa

Source:http://www.flickr.com/photos/lordv/207198441/

Egg Raft

Larvae

Terrestrial

Aquatic

4

Mosquito EnvironmentalDependencies

For increasing their numbers:

• Temperature (Average number of nighttemperatures above 15ºC; heat accumulated days)

• Habitat availability

• Rainfall

5

Human Dependencies

For using protective measures:

• Perceived risk

• Knowledge of WNV

• Temperature

6

Health Managers’ Dilemma

Need to make decisions now taking into accountuncertainties regarding:

1. Mosquito population• abundance

• WNV prevalence

2. Environmental conditions• current

• forecasted

3. Human behaviour

7

Different Levels of Challenge inDynamic Decision-Making

• Type A: Making complex dynamic choices given someexpected/typical course of important factors outsideour control– Here, the focus is centred on building models that help us

understand the complex impact of our choices given this‘expected course’

– Tough

• Type B: Making complex dynamic choices when wecan’t anticipate the course of the important factorsoutside our control– Focus on both dynamic model and adaptive planning given

uncertainty

– Tougher8

Implications

• Type A: When important exogenous conditions areknown, we often seek to identify & stick to an‘optimal’ pre-set plan

– Don’t have to worry much about unfolding externalconditions – they are known or unimportant

Type B: Rather than putting “all our eggs in one basket”, itis typically best to avoid a pre-set plan, and instead toadaptively make choices over time

– What we will do over time will depend on what isobserved

9

Decision making under dynamicuncertainty: Adaptive Planning

• Type A: When important exogenous conditions areknown, we often seek to identify & stick to an‘optimal’ pre-set plan

– Don’t have to worry much about unfolding externalconditions – they are known or unimportant

Type B: Rather than putting “all our eggs in one basket”,we typically seek to avoid a pre-set plan, and instead toadaptively make choices over time

– What we will do over time will depend on what isobserved

10

The presentation focuses on this typeof “dynamic decision” problems

Adaptive Decision Problems: RelevantQuestions

How do we make decisions now, when the choice ofthe best decision depends so much on what playsout (unfolds) in things beyond our control?

– Temperature trends

– Precipitation

– Prevalence of infection in migratory bird populations

Should we make our decisions now despite theseuncertainties? Or should we wait to see how thingsare trending before making decisions?

11

Characteristics of “Adaptive Decision”Problems

• Can’t count on one particular future trajectoryunfolding for things outside our control

• Choosing decisions now requires considering thedifferent possibilities of what might unfold in thefuture

• We must make decisions over time, as we observethings unfold

– The later we wait, the more information we’ll have

• It may be advantageous to decide to “wait and see”as to how things play out until a later decision point12

In these Conditions… What decision we make at a particular point in time will

depend on

Our current situation

What we’ve observed as happening to this point (what we’ve “learned” –e.g. recent levels of growth)

State (as given by stocks & derived quantities)

Possible future eventualities, in light of what we’ve already seen(e.g. future levels of growth, given recent growth)

Our possible decision points in the future

• Here, we are balancing two desires:• To “seize the moment” and act early

• To “wait and see” what happens, and decide on the basis of this

13

A Hybrid System Architecture toAddress these “Tougher” Problems

14

SimulationModel

Decision Tree

Events & Decisions

Consequences

Introduction to Decision Trees

• We will use decision trees both for– Diagrammatically illustrating decision making

w/uncertainty

– Quantitative reasoning

• Represent– Flow of time

– Decisions

– Uncertainties (via events)

– Consequences (deterministic or stochastic)

Decision Tree Nodes

• Decision (choice) Node

• Chance (event) Node

• Terminal (consequence) node

Time

Identifying the Optimal Decision Rule

• To select decision rules, we perform a“rollback” of the tree (dynamic programming)– For terminal nodes, pass up value

– For event nodes, pass up expected value ofchildren

– For decision nodes, select whichever child offershighest value and pass up that value for this node

Example TreeFeature Decision Making

Best Option

Extended Example

Extended Rule

Decision Rules

• Decision trees can be used to identify“optimal” decision rules

– Remember: Optimality is in light of (simplified)assumptions!

• A decision rule specify what we should dogiven any possible eventuality

Decision Tree To Structure Policy Space

High later growth

PLConsequence1

Low later growth

1-PLConsequence2

Expand capacity

High later growth

PLConsequence3

Low later growth

1-PLConsequence4

No expansion

High early growth

PE

High later growth

PLConsequence5

Low later growth

1-PLConsequence6

Expand capacity

High later growth

PLConsequence7

Low later growth

1-PLConsequence8

No expansion

Low early growth

1-PE

Expand capacity

High later growth

PLConsequence9

Low later growth

1-PLConsequence10

Expand capacity

High later growth

PLConsequence11

Low later growth

1-PLConsequence12

No expansion

High early growth

PE

High later growth

PLConsequence13

Low later growth

1-PLConsequence14

Expand capacity

High later growth

PLConsequence15

Low later growth

1-PLConsequence16

No Expansion

Low early growth

1-PE

No expansion

Initial Decision

Time

Decision node

Event node

TerminalNode(Consequencefor Scenario)

Scenario

Terminology

• A static decision rule pursues the samepredetermined decisions (actions) regardless ofeventualities

• An adaptive decision rule varies its decisions(actions) based on which events have occurred

• Observation: Static decision rules are rarelyoptimal

High later growth

PLConsequence1

Low later growth

1-PLConsequence2

Expand capacity

High later growth

PLConsequence3

Low later growth

1-PLConsequence4

No expansion

High early growth

PE

High later growth

PLConsequence5

Low later growth

1-PLConsequence6

Expand capacity

High later growth

PLConsequence7

Low later growth

1-PLConsequence8

No expansion

Low early growth

1-PE

Expand capacity

High later growth

PLConsequence9

Low later growth

1-PLConsequence10

Expand capacity

High later growth

PLConsequence11

Low later growth

1-PLConsequence12

No expansion

High early growth

PE

High later growth

PLConsequence13

Low later growth

1-PLConsequence14

Expand capacity

High later growth

PLConsequence15

Low later growth

1-PLConsequence16

No Expansion

Low early growth

1-PE

No expansion

Initial Decision

A Static Decision RuleObservation:The consequencesobserved at aparticular terminalnode are a function ofthe associatedscenario (particularsequence ofdecisions and eventson the path leadingto that terminal node)– and are the sameregardless as to whichdecision rule that givesrise to this sequence

High later growth

PLConsequence1

Low later growth

1-PLConsequence2

Expand capacity

High later growth

PLConsequence3

Low later growth

1-PLConsequence4

No expansion

High early growth

PE

High later growth

PLConsequence5

Low later growth

1-PLConsequence6

Expand capacity

High later growth

PLConsequence7

Low later growth

1-PLConsequence8

No expansion

Low early growth

1-PE

Expand capacity

High later growth

PLConsequence9

Low later growth

1-PLConsequence10

Expand capacity

High later growth

PLConsequence11

Low later growth

1-PLConsequence12

No expansion

High early growth

PE

High later growth

PLConsequence13

Low later growth

1-PLConsequence14

Expand capacity

High later growth

PLConsequence15

Low later growth

1-PLConsequence16

No Expansion

Low early growth

1-PE

No expansion

Initial Decision

An Adaptive Decision RuleObservation:The consequencesobserved at aparticular terminalnode are a function ofthe associatedscenario (particularsequence ofdecisions and eventson the path leadingto that terminal node)– and are the sameregardless as to whichdecision rule that givesrise to this sequence

Analysis Using Decision Trees

• Decision trees are a powerful analysis tool

• Addition of symbolic components to decisiontrees greatly expand power

• Example analytic techniques

– Strategy selection

– One-way and multi-way sensitivity analyses

– Value of information

Decision Tree w/Variables

Risk Preference• People are not indifferent to uncertainty

– Lack of indifference from uncertainty arises fromuneven preferences for different outcomes

– E.g. someone may

• dislike losing $x far more than gaining $x

• value gaining $x far more than they disvalue losing $x.

• Individuals differ in comfort with uncertaintybased on circumstances and preferences

• Risk averse individuals will pay “riskpremiums” to avoid uncertainty

Risk Preference(Decision Tree Preview)

Categories of Risk Attitudes

• Risk attitude is a general way of classifying riskpreferences

• Classifications– Risk averse fear loss and seek sureness

– Risk neutral are indifferent to uncertainty

– Risk lovers hope to “win big” and don’t mindlosing as much

• Risk attitudes change over– Time

– Circumstance

Preference Function

• Formally expresses a particular party’s degreeof preference for (satisfaction with) differentoutcomes ($, time, level of conflict, quality…)

• Can be systematically derived

• Used to identify best decision when haveuncertainty with respect to consequences

– Choice with the highest mean preference is thebest strategy for that particular party

Challenge:Identify these Preference Functions

• (On the Board)

Risk Attitude in Preference Fns

Identifying Preference Functions

• Simple procedure to identify utility valueassociated with multiple outcomes

• Interpolation between these data pointsdefines the preference function

Notion of a Risk Premium• A risk premium is the amount paid by a (risk

averse) individual to avoid risk

• Risk premiums are very common – what aresome examples?– Insurance premiums

– Higher fees paid by owner to reputablecontractors

– Higher charges by contractor for risky work

– Lower returns from less risky investments

– Money paid to ensure flexibility as guard againstrisk

CertaintyEquivalentExample

• Consider a risk averse individual withpreference fn f faced with an investment cthat provides– 50% chance of earning $20000– 50% chance of earning $0

• Average money from investment =– .5*$20,000+.5*$0=$10000

• Average satisfaction with the investment=– .5*f($20,000)+.5*f($0)=.25

• This individual would be willing to trade fora sure investment yielding satisfaction>.25instead– Can get .25 satisfaction for a sure f-1(.25)=$5000

• We call this the certainty equivalent to theinvestment

– Therefore this person should be willing to tradethis investment for a sure amt of money>$5000

.25

Mean valueOf investment

Mean satisfaction withinvestment

Certainty equivalentof investment

$500

0

.50

Example Cont’d (Risk Premium)• The risk averse individual would be willing to

trade the uncertain investment c for anycertain return which is > $5000

• Equivalently, the risk averse individual wouldbe willing to pay another party an amount rup to $5000 =$10000-$5000 for other less riskaverse party to guarantee $10,000

– Assuming the other party is not risk averse, thatparty wins because gain r on average

– The risk averse individual wins b/c more satisfied

Certainty Equivalent• More generally, consider situation in which have

– Uncertainty with respect to consequence c

– Non-linear preference function f

• Note: E[X] is the mean (expected value) operator

• The mean outcome of uncertain investment c is E[c]

– In example, this was .5*$20,000+.5*$0=$10,000

• The mean satisfaction with the investment is E[f(c)]

– In example, this was .5*f($20,000)+.5*f($0)=.25

• We call f-1(E[f(c)]) the certainty equivalent of c

– Size of sure return that would give the same satisfaction asc

– In example, was f-1(.25)=f-1(.5*20,000+.5*0)=$5,000

Risk Attitude Redux

• The shapes of the preference functions meanscan classify risk attitude by comparing thecertainty equivalent and expected value

– For risk loving individuals, f-1(E[f(c)])>E[c]

– For risk neutral individuals, f-1(E[f(c)])=E[c]

– For risk averse individuals, f-1(E[f(c)])<E[c]

Motivations for a Risk Premium• Consider

– Risk averse individual A for whom f-1(E[f(c)])<E[c]

– Less risk averse party B

• A can lessen the effects of risk by paying a riskpremium r of up to E[c]-f-1(E[f(c)]) to B inreturn for a guarantee of E[c] income

– The risk premium shifts the risk to B

– The net investment gain for A is E[c]-r, but A ismore satisfied because E[c] – r > f-1(E[f(c)])

– B gets average monetary gain of r

Multiple Attribute Decisions

• Frequently we care about multiple attributes

– Cost

– Time

– Quality

– Relationship with owner

• Terminal nodes on decision trees can capturethese factors – but still need to make differentattributes comparable

WNV Hybrid ApproachSD Model Decision Tree

User Interface

44

Susceptible Humans

Asymptomactically Infected Humans

Deaths ofSusecptible Humans

Deaths fromAsymptomacticallyInfected Humans

Entrants ofHumans

Per Bite Probability oftransmission from infected

mosquito to human

<Number of bitings ofsusceptible humans by infected

mosquitoes per day>

Recruitment rate ofsusceptible humans

WNV-induced deathrate for humans

Vaccinated Humans

Vaccination

Loss Immunity ofVaccinated Humans Vaccine Rate

Rate of Loss Immunityof Vaccinal Humans

Humans

Hospitalized WNV Patients

Non-Hospitalized WNF Patients UnderRecovery

Discharge of WNFCases from Hospital

Recovered and WNV ImmunePatients

Recovery ofWNF Patients

Effects to be Factored in: Season/TemperatureDependence of Mosquito biting preferences

(presumably through baby bird depletion). Late-seasonmosquitoes heading into Burrows to hibernate (this being

more driven by length of day)

Mean Time in Hospitalfor WNF Patients

Loss of Immunity of Recovered andWNV Immune Patients

Deaths of WNFPatients under

Recovery

Mean Time toWaning Immunity

Mean Time toRecover for

PostHospital WNFPatients

<Mean Time toWaning Immunity>

Recovery ofAsymptomatically Infected

Patients

CumulativeWNVSymptomaticCases

New SymptomaticCases

InterventionSelected

Average lifespanof a human

<Average lifespan ofa human>

<Average lifespan ofa human>

<Average lifespan ofa human>

Mean Time to Recover forAsymtomatically Infected

Patients

Exposed Humans

Newly InfectedPre-symptomatic Human

Cases

AsymptomaticInfection

Mean Time Until WNVIncubates in Humans

Number of Human Cases perDay Completing WNV

Incubation

Progression to Non-Hospitalized WNF

Hospitalized forWNF

<Fraction of ExposedHumans that remain

asymptomatic>

<Fraction of Exposed Humanswith NonHospitalized WNF>

<Fraction of ExposedHumans with WNF that

are hospitalized>

<Progression toNon-Hospitalized

WNF>

<Hospitalized forWNF>

Force of Infectionfor Humans

NegativeOfCumulativeWNVSymptomaticCases

Deaths of Recovered andWNV Immune Patients

<Average lifespan ofa human>

Deaths of HospitalizedWNV Patients

Deaths due toWNV

<Average lifespan ofa human>

Deaths of ExposedHumans

<Average lifespan ofa human>

<Number of Human Casesper Day Completing WNV

Incubation>

Deaths ofVaccinated

Humans

The Hybrid Approach: Critical Points

45

1. Is a framework geared toward an ongoing process ofobservation & decision making

2. Captures uncertainties as time progresses

3. Simulates a broad range of possibilities (e.g. for temperature)and not just a single scenario

4. Allows for staging of decisions over different time points –including decisions to “wait & see” (exploiting future options)

5. Could be used for diverse planning challenges (e.g. H1N1 givenuncertainty regarding public reaction, vaccine availability)

Responsibilities in the HybridApproach

46

Simulation Model Decision Tree

• Calculates dynamic consequences ofa sequence over time of

• Events• Choices

• Takes care of deterministic simulationgiven events & decisions

• Represents over time possiblesequences of

• Uncertainties (event nodes)• Decisions (decision nodes)

• Consequences (outcomes – e.g.Cost, quality of life, etc.)• Takes care of encapsulating

• Capturing all uncertainties• “policy space” – where policies

are made over time

Example: WNV Hybrid Approach

47

Simulation Model Decision Tree

• Mosquito lifecycle (includes temperatureeffects)

• Bird lifecycle• Transmission between mosquitos & bird• Human infection & disease progression• Future: costs & resource use (via resource

intensity weights, lengthof stay), quality of life

• Decision options over time (sourcereduction, larvaciding, vaccination,wait & see)

• Uncertainties (temperature)• Consequences (all WNV cases, severe

neurological cases, costs, etc.)

West NileVirus SDModel

48

Susceptible AdultFemale Mosquitoes

EndogeneouslyCalculated Infectious

Mosquitoes

Death of SusceptibleAdult FemaleMosquitoes

Death of InfectiousMosquitoes

Larval FemaleMosquitoes

Exposed AdultFemale Mosquitoes

Death for LarvalFemale Mosquitoes

Death of ExposedFemale AdultMosquitoes

Natural Death Rate ofLarval FemaleMosquitoes

Mean Time toLarval Maturation

Maturation ofLarvae

Pathogen TransmissionFrom Infected Bird to

Susceptible Mosquitoes

Disease Incubation

Virus IncubationRate

Susceptible Humans

Asymptomactically Infected Humans

Deaths ofSusecptible Humans

Deaths fromAsymptomacticallyInfected Humans

Entrants ofHumans

Per Bite Probability oftransmission from infected

mosquito to human

<Number of bitings ofsusceptible humans by infected

mosquitoes per day>

Recruitment rate ofsusceptible humans

WNV-induced deathrate for humans

Death rate for AdultMosquitoes

<Death rate for AdultMosquitoes>

<Death rate for AdultMosquitoes>

Vaccinated Humans

Vaccination

Loss Immunity ofVaccinated Humans Vaccine Rate

Rate of Loss Immunityof Vaccinal Humans

Mosquitoes

Humans

Hospitalized WNV Patients

Number of adultblood meals per

day

Non-Hospitalized WNF Patients UnderRecovery

Discharge of WNFCases from Hospital

Recovered and WNV ImmunePatients

Recovery ofWNF Patients

Egg Laying Rate for AdultFemale Mosquitoes

Eggs

<Total Adult FemaleMosquitoes>

Egg Laying by AdultFemale Mosquitoes

Density

Mean Time as Egg

CurrentTemperatureInCentigrade

Effects to be Factored in: Season/TemperatureDependence of Mosquito biting preferences

(presumably through baby bird depletion). Late-seasonmosquitoes heading into Burrows to hibernate (this being

more driven by length of day)

Pupae FemaleMosquitoes

Maturation ofPupae

Mean Time toPupal

Maturation

Mean Time in Hospitalfor WNF Patients

Loss of Immunity of Recovered andWNV Immune Patients

Deaths of WNFPatients under

Recovery

Mean Time toWaning Immunity

Mean Time toRecover for

PostHospital WNFPatients

<Mean Time toWaning Immunity>

Mean Time to LarvalMaturation for Given

Temperature

<CurrentTemperatureInCentigrade>

Virus IncubationThreshold

Temperature

Recovery ofAsymptomatically Infected

Patients

InterventionSelected

AdulticidingDeath Rate

Natural death rate forAdult Mosquitoes

Source Reduction DeathRate of Larval Female

Mosquitoes

Larvacide Reduction DeathRate of Larval Female

Mosquitoes

Death Rate for LarvalFemale Mosquitoes

Average lifespanof a human

<Average lifespan ofa human>

<Average lifespan ofa human>

<Average lifespan ofa human>

Mean Time to Recover forAsymtomatically Infected

Patients

<InterventionSelected>

Death for PupalFemale Mosquitoes

Natural Death Rate ofPupal FemaleMosquitoes

Natural Death Rate ofImmature Female

Mosquitoes

Larvacide Reduction DeathRate of Pupal Female

Mosquitoes<InterventionSelected>

Source Reduction DeathRate of Pupal Female

Mosquitoes

Source Reduction DeathRate of Immature Female

Mosquitoes

Death rate for PupalFemale Mosquitoes

Exposed Humans

Newly InfectedPre-symptomatic Human

Cases

AsymptomaticInfection

Mean Time Until WNVIncubates in Humans

Number of Human Cases perDay Completing WNV

Incubation

Progression toNon-Hospitalized WNF

Hospitalized for WNF& Neurological

Symptoms

<Fraction of ExposedHumans that remain

asymptomatic>

<Fraction of Exposed Humanswith NonHospitalized WNF>

<Fraction of ExposedHumans with WNF that

are hospitalized>

Number of adult bloodmeals per day for Given

Temperature

Egg Laid perBlood Meal

Birth Coefficient

Force of Infectionfor Humans

Deaths of Recovered andWNV Immune Patients

<Average lifespan ofa human>

Larvacide Reduction DeathRate of Immature Female

Mosqutioes

Deaths of HospitalizedWNV Patients

Deaths due toWNV

<Average lifespan ofa human>

Deaths ofExposedHumans

<Average lifespan ofa human>

Deaths ofVaccinated

Humans

<Fraction of ExposedHumans with Neurological

Symptoms that arehospitalized>

Egg to MosquitoLarva Ratio

Eggs Hatching toMosquito Larvae

Female Eggs Hatchingto Mosquito Larvae

ExtrinsicIncubation Period

<Disease Transmission toSusceptible Mosquitoes Through

Contact with Infectious AdultBirds (AB to M)>

<Disease Transmission toSusceptible Mosquitoes ThroughContact with Infectious Juvenile

Birds (JB to M)>

back

Decision Tree

49

Current temp = 20oC

Current temp = 30oC

Adulticiding = 3

Current temp = 20oC

Current temp = 30oC

Larvaciding = 2

Current temp = 20oCAdulticiding = 3

Current temp = 20oC

Current temp = 30oC

Larvaciding = 2

Larvaciding = 2

Do Nothing = 0

Current temp = 20oC

Current temp = 30oC

Larvaciding = 2

Do Nothing = 0

Current temp = 20oC

Current temp = 30oC

Current temp = 20oC

Current temp = 20oC

Current temp = 30oC

Larvaciding = 2

Do Nothing = 0

Current temp = 15oC

Current temp = 20oC

Current temp = 15oC

Current temp = 30oC

Current temp = 20oC

Current temp = 30oC

Current temp = 20oC

Current temp = 30oC

Current temp = 30oC

Current temp = 20oC

Current temp = 30oC

Current temp = 20oCDo Nothing = 0

Larvaciding = 2

Do Nothing = 0

Time (weeks)Week 1 Week 2

Larvaciding = 2

Adulticiding = 3

Source Reduction = 1

…. same as below

…. same tree structure as the “do nothing”branch

…. same tree structure as the “donothing” branch

back