CHAPTER 6 (handout) Decision Trees

6.1. Introduction

Sequential decision making
• A sequence of chance-dependent decisions
• Presentation of the analysis can be complex

Decision trees
• Pictorial device to represent the problem and its calculations
• Useful for problems with a small number of sequential decisions

6.3. Another Decision Tree Ex.

2 boxes, externally identical. Must decide which box:
• a1: box 1: 6 black balls, 4 white balls
• a2: box 2: 8 black balls, 2 white balls

Payoff
• Correct guess: receive $100
• Wrong guess: receive $0

Prior probability
• P(1) = 0.5
• P(2) = 0.5

Decision Tree

A connected set of nodes & arcs

• Nodes: join arcs
• Arcs: have direction (left to right)
• Branch: an arc and all elements that follow it

2 branches from same initial node cannot have elements in common

2 nodes cannot be joined by > 1 arc

Example of a Decision Tree

A diagram which is not a tree

Types of nodes

• Decision point: choosing the next action (branch)
• Chance node: uncontrollable probabilistic event
• Terminal node: specifies the final payoff

Example of Sequential Decision Problem

Car Exchange Problem

A person must decide whether to keep his car or exchange it in a showroom. There are 2 actions:

• a1: keep, cost = 1400 SR
• a2: exchange, which has 2 possible outcomes:
  • good buy: P(G) = 0.6, cost = 1200 SR
  • bad buy: P(B) = 0.4, cost = 1600 SR

A good or bad buy can be identified only after buying and using the car. What should he do to minimize his expected cost?

Car Exchange Problem (no information)

Payoff (Cost) Matrix

State      P(state)   a1: keep   a2: exchange
1: Good    0.6        1400       1200
2: Bad     0.4        1400       1600
EV                    1400       1360
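The EV row can be reproduced with a few lines of Python. This is a minimal sketch using the probabilities and costs from the matrix. The dictionary layout and the names `probs`, `costs` and `expected_cost` are illustrative assumptions, not part of the handout.

```python
# Car exchange problem without extra information (costs in SR).
probs = {"good": 0.6, "bad": 0.4}
costs = {
    "a1_keep": {"good": 1400, "bad": 1400},
    "a2_exchange": {"good": 1200, "bad": 1600},
}

def expected_cost(action):
    """Probability-weighted cost of one action over the two states."""
    return sum(probs[state] * costs[action][state] for state in probs)

expected = {action: expected_cost(action) for action in costs}
best_action = min(expected, key=expected.get)  # costs, so smaller is better
print(expected, best_action)
```

Because the entries are costs, the preferred action is the one with the smallest expected value.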

[Figure: Car exchange decision tree. The Exchange arc and the chance arcs G: 0.6 and B: 0.4 are labeled.]

6.2. A Sequential Test Problem

Car Exchange Problem

Assume the person has 5 options for deciding whether to keep or exchange his car:

(i) Decide without extra information
(ii) Decide on the basis of a free road (driving) test
(iii) Decide after an oil consumption test costing $25
(iv) Decide after a combined road/oil test costing $10
(v) Decide sequentially: road test, then possibly an oil test costing $10

In (iv), both tests must be taken. In (v), the oil test is optional, depending on the road test result.

Car Exchange Problem (with information)

• The decision tree is complicated

• Cannot fit in 1 slide

• 5 branches: 5 options

• Probabilities after extra information are conditional (posterior)

• To illustrate, we choose the branch of option (v)

• Road test then, depending on result, possible oil test costing $10

Car Exchange Problem (with information)

Result of road test:

• y1 : fair p(y1) = 0.5

• y2 : poor p(y2) = 0.5

Result of oil consumption test:

• Z1 : high p(Z1|y)

• Z2 : medium p(Z2 |y)

• Z3 : low p(Z3 |y)

[Figure: Car exchange decision tree (with information). Road test with outcomes y1: 0.5 and y2: 0.5; after each outcome, a choice between No test and Oil test (outcomes Z1 to Z3).]

[Figure: Car exchange decision tree with information (y1 branch). After y1: 0.5, choose No test or Oil test. Oil test outcomes: Z1: 0.28, Z2: 0.24, Z3: 0.48. Probability pairs on the final chance arcs: 0.6/0.4, 0.43/0.57, 0.5/0.5, 0.75/0.25.]

[Figure: Car exchange decision tree with information (y2 branch). After y2: 0.5, choose No test or Oil test. Oil test outcomes: Z1: 0.32, Z2: 0.26, Z3: 0.42. Probability pairs on the final chance arcs: 0.6/0.4, 0.4/0.6, 0.25/0.75, 0.25/0.75, 0.31/0.69, 0.57/0.43.]

Decision Tree Calculations

• Tree is developed from left to right

• Calculations are made from right to left

• Many calculations are redundant
  • They belong to inferior solutions
  • They are not needed in the final solution

• Probabilities after extra information (road or oil tests) are conditional (posterior)

• Calculated by Bayes’ theorem
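The right-to-left calculation can be sketched as a small recursive function. The tuple encoding of nodes and the name `rollback` are hypothetical choices for illustration. The example tree is the car exchange problem without information, where payoffs are costs, so decision nodes minimize.

```python
# Hypothetical node encoding (not from the handout):
#   ("terminal", cost)
#   ("chance", [(probability, child), ...])
#   ("decision", {action_name: child, ...})

def rollback(node):
    """Return (expected cost, chosen action at this node or None)."""
    kind = node[0]
    if kind == "terminal":
        return node[1], None
    if kind == "chance":
        # Expected value over the outgoing arcs.
        return sum(p * rollback(child)[0] for p, child in node[1]), None
    # Decision node: keep the action with the smallest expected cost.
    values = {action: rollback(child)[0] for action, child in node[1].items()}
    best = min(values, key=values.get)
    return values[best], best

car_tree = ("decision", {
    "keep": ("chance", [(0.6, ("terminal", 1400)), (0.4, ("terminal", 1400))]),
    "exchange": ("chance", [(0.6, ("terminal", 1200)), (0.4, ("terminal", 1600))]),
})

value, action = rollback(car_tree)
print(value, action)
```

For a reward problem the same sketch applies with `max` in place of `min` at decision nodes.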

Initial Payoff Data (no information)

Payoff (Reward) Matrix

State       P(state)   a1: Box 1   a2: Box 2
1: Box 1    0.5        100         0
2: Box 2    0.5        0           100
EV                     50          50

Initial Probability Data (no information)

Prior Probability Matrix

State       P(state)   B: Black   W: White
1: Box 1    0.5        0.6        0.4
2: Box 2    0.5        0.8        0.2

[Figure: Decision tree without information. Each chance node has arcs 1: 0.5 and 2: 0.5.]

Decision Tree Example with information

• Samples from the box can be taken
• Each ball is returned to the box
• Up to 2 samples are allowed
• Cost = $3 per sample

What is the optimal plan?

Posterior probabilities for sample 1

Probability Calculations

State   P(state)   P(B|state)   P(W|state)   Joint B   Joint W   Posterior given B   Posterior given W
1       0.5        0.6          0.4          0.3       0.2       0.43                0.67
2       0.5        0.8          0.2          0.4       0.1       0.57                0.33
Total   1.0                                  0.7       0.3       1.00                1.00
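The posterior columns follow from Bayes' theorem: joint = prior × likelihood, then divide by the column total. A minimal sketch with exact fractions is given here. The names `LIKELIHOOD` and `update` are illustrative assumptions.

```python
from fractions import Fraction

# P(colour | box), from the ball counts: box 1 has 6B/4W, box 2 has 8B/2W.
LIKELIHOOD = {
    "B": {1: Fraction(6, 10), 2: Fraction(8, 10)},
    "W": {1: Fraction(4, 10), 2: Fraction(2, 10)},
}

def update(prior, colour):
    """Return (P(colour), posterior over boxes) after one draw."""
    joint = {box: prior[box] * LIKELIHOOD[colour][box] for box in prior}
    marginal = sum(joint.values())
    posterior = {box: value / marginal for box, value in joint.items()}
    return marginal, posterior

prior = {1: Fraction(1, 2), 2: Fraction(1, 2)}
p_black, post_black = update(prior, "B")
p_white, post_white = update(prior, "W")
print(float(p_black), {box: round(float(v), 2) for box, v in post_black.items()})
print(float(p_white), {box: round(float(v), 2) for box, v in post_white.items()})
```

The exact posteriors are 3/7 and 4/7 after a Black draw and 2/3 and 1/3 after a White draw, which round to the table entries.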

[Figure: Decision tree with information. First choice: No sample (no information, then a1 or a2) or Sample 1 (B: 0.7, W: 0.3). After sample 1, choose Sample 2 or No sample.]

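Posterior probabilities for sample 2 when sample 1 is Black

State   Prior (after sample 1)   P(B|state)   P(W|state)   Joint B   Joint W   Posterior given B   Posterior given W
1       0.43                     0.6          0.4          0.26      0.17      0.36                0.61
2       0.57                     0.8          0.2          0.46      0.11      0.64                0.39
Total   1.0                                                0.72      0.28      1.00                1.00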

[Figure: Sample 1 Black, no sample 2. Actions a1 and a2 each lead to chance arcs 1: 0.43 and 2: 0.57. The alternative arc is Sample 2 (B: 0.72, W: 0.28).]

[Figure: Samples 1 and 2 both Black. After sample 2 (B: 0.72, W: 0.28), the Black branch has chance arcs 1: 0.36 and 2: 0.64.]

[Figure: Sample 1 Black, sample 2 White. After sample 2 (B: 0.72, W: 0.28), the White branch has chance arcs 1: 0.61 and 2: 0.39.]

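Posterior probabilities for sample 2 when sample 1 is White

State   Prior (after sample 1)   P(B|state)   P(W|state)   Joint B   Joint W   Posterior given B   Posterior given W
1       0.67                     0.6          0.4          0.40      0.27      0.61                0.79
2       0.33                     0.8          0.2          0.26      0.07      0.39                0.21
Total   1.0                                                0.66      0.34      1.00                1.00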
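The second-sample tables reuse the first-sample posterior as the new prior. The sketch below chains the updates with exact fractions. The function name `posterior_box1` is an illustrative assumption. Because the tables start from rounded priors (0.43, 0.67), some exact values differ from them in the second decimal place.

```python
from fractions import Fraction

P_BLACK = {1: Fraction(6, 10), 2: Fraction(8, 10)}  # P(Black | box)

def posterior_box1(draws, prior1=Fraction(1, 2)):
    """P(box 1 | sequence of draws), updating one draw at a time."""
    p1 = prior1
    for colour in draws:
        like1 = P_BLACK[1] if colour == "B" else 1 - P_BLACK[1]
        like2 = P_BLACK[2] if colour == "B" else 1 - P_BLACK[2]
        # Bayes' theorem with the previous posterior as the prior.
        p1 = p1 * like1 / (p1 * like1 + (1 - p1) * like2)
    return p1

for draws in ("BB", "BW", "WB", "WW"):
    print(draws, float(posterior_box1(draws)))
```

The exact results are 0.36 for BB, 0.60 for one Black and one White in either order, and 0.80 for WW.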

[Figure: Sample 1 White, no sample 2. Actions a1 and a2 each lead to chance arcs 1: 0.67 and 2: 0.33. The alternative arc is Sample 2 (B: 0.66, W: 0.34).]

[Figure: Sample 1 White, sample 2 Black. After sample 2 (B: 0.66, W: 0.34), the Black branch has chance arcs 1: 0.61 and 2: 0.39.]

[Figure: Samples 1 and 2 both White. After sample 2 (B: 0.66, W: 0.34), the White branch has chance arcs 1: 0.79 and 2: 0.21.]

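Decision tree summary of results

[Figure: Summary tree. No samples: no information, a1 or a2. Sample 1: B: 0.7, W: 0.3, each followed by Sample 2 or No 2nd sample. After a Black first sample, sample 2 gives B, 0.72: a2 and W, 0.28: a1. After a White first sample, sample 2 gives B, 0.66: a1 and W, 0.34: a1. Legend: a1: 6B, 4W; a2: 8B, 2W.]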
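The sequential plan can be checked by rolling back over the belief P(box 1) instead of drawing the tree. The sketch below is computed from the stated data ($100 reward, $3 per sample, at most 2 samples) and is not quoted from the slides. The names `best_value`, `REWARD` and `COST` are illustrative assumptions.

```python
from fractions import Fraction

P_BLACK = {1: Fraction(6, 10), 2: Fraction(8, 10)}  # P(Black | box)
REWARD, COST = 100, 3

def best_value(p1, samples_left):
    """Return (expected net payoff, best immediate choice) given P(box 1) = p1."""
    stop = REWARD * max(p1, 1 - p1)  # guess the more likely box now
    if samples_left == 0:
        return stop, "guess"
    p_black = p1 * P_BLACK[1] + (1 - p1) * P_BLACK[2]
    post_b = p1 * P_BLACK[1] / p_black              # P(box 1 | Black)
    post_w = p1 * (1 - P_BLACK[1]) / (1 - p_black)  # P(box 1 | White)
    sample = (p_black * best_value(post_b, samples_left - 1)[0]
              + (1 - p_black) * best_value(post_w, samples_left - 1)[0]
              - COST)
    return (sample, "sample") if sample > stop else (stop, "guess")

value, first_move = best_value(Fraction(1, 2), 2)
after_black = best_value(Fraction(3, 7), 1)
after_white = best_value(Fraction(2, 3), 1)
print(float(value), first_move, after_black[1], after_white[1])
```

Under these assumptions the sketch takes the first sample, takes a second sample only after a Black draw, and guesses immediately after a White draw. This agrees with the summary, where both outcomes of a second sample after White lead to a1.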

Decision Tree with Fixed Costs

Example of fixed cost:
• Sampling cost = $3 per sample in the previous example

If the objective is to maximize expected payoff, constant costs can be deducted from either:
• Terminal node payoffs
• Expected values
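The two ways of deducting a constant cost give the same expected payoff because expectation is linear. A minimal check, using the posteriors after a Black first sample and action a1, is sketched here. The variable names are illustrative.

```python
posterior = {1: 0.43, 2: 0.57}  # P(box | first sample Black), from the handout
payoff_a1 = {1: 100, 2: 0}      # reward for guessing box 1
cost = 3                        # one sample already taken

# Deduct the cost from the expected value.
ev_then_deduct = sum(posterior[s] * payoff_a1[s] for s in posterior) - cost

# Deduct the cost from each terminal payoff, then take the expected value.
deduct_then_ev = sum(posterior[s] * (payoff_a1[s] - cost) for s in posterior)

print(ev_then_deduct, deduct_then_ev)
```

Both expressions evaluate to 40 (up to floating-point rounding), that is, 43 – 3.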

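Example: Including fixed costs

[Figure: Sample 1 Black, cost = $3, action a1 with chance arcs 1: 0.43 and 2: 0.57. The cost is deducted either from the expected value (43 – 3) or from the terminal payoffs (for example, $–3). Recall slide 9.]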

Fixed Costs & Utilities

Utilities can be used instead of payoffs. If the objective is to maximize expected utility:
• Constant costs must be deducted from terminal node payoffs
• Net payoffs are converted to net utilities
• Expected values are taken of the utilities of net payoffs
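With a nonlinear utility, the order of operations matters. The sketch below assumes an exponential utility U(x) = 1 – exp(–x/100). Both the functional form and the risk tolerance of 100 are assumptions chosen only for illustration, since the handout does not specify a utility function.

```python
import math

def utility(x, risk_tolerance=100.0):
    """Exponential utility (assumed form, not from the handout)."""
    return 1.0 - math.exp(-x / risk_tolerance)

posterior = {1: 0.43, 2: 0.57}  # P(box | first sample Black)
payoff_a1 = {1: 100, 2: 0}      # reward for guessing box 1
cost = 3

# Correct: utilities of net terminal payoffs, then the expectation.
correct = sum(p * utility(payoff_a1[s] - cost) for s, p in posterior.items())

# Incorrect: expected utility of gross payoffs minus the utility of the cost.
incorrect = sum(p * utility(payoff_a1[s]) for s, p in posterior.items()) - utility(cost)

print(correct, incorrect)
```

The two numbers differ, which is why the cost has to be deducted at the terminal nodes before converting to utilities.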

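Including fixed costs

[Figure: Action a1 with chance arcs 1: 0.43 and 2: 0.57. Incorrect: expected utility from U(100), followed by EU – U(3). Correct: utilities of net terminal payoffs, for example U(–3).]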

Allowing an optional 3rd sample

• Suppose now a 3rd sample is allowed
• Sample cost = $3
• Assume the decision whether or not to take sample 3 depends on the results of samples 1 and 2

Posterior probabilities for sample 3

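After 2 blacks (slide 8)

State   Prior (after 2 blacks)   P(B|state)   P(W|state)   Joint B   Joint W   Posterior given B   Posterior given W
1       0.36                     0.6          0.4          0.22      0.14      0.30                0.52
2       0.64                     0.8          0.2          0.51      0.13      0.70                0.48
Total   1.0                                                0.73      0.27      1.00                1.00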

[Figure: Decision tree with optional sample 3. No sample, or Sample 1 (B: 0.7, W: 0.3). After each outcome: Sample 2 or No 2nd sample, then Sample 3 or No 3rd sample.]

Fixing the number of samples

• Suppose now a 3rd sample is allowed
• Sample cost = $3
• Assume we must decide the number of samples in advance: 0, 1, 2, or 3

[Figure: Zero samples. Actions a1: Box 1 and a2: Box 2, each with chance arcs 1: 0.5 and 2: 0.5. Expected payoff is $50 for either action, so no samples gives 50.]

[Figure: One sample (sample once). B: 0.7 with chance arcs 1: 0.43 and 2: 0.57; W: 0.3 with chance arcs 1: 0.67 and 2: 0.33.]

Posterior probabilities for 2 samples

Examples (BW denotes one Black and one White, in either order):
P(BB|1) = 0.6(0.6) = 0.36
P(BW|1) = P(B then W|1) + P(W then B|1) = 0.6(0.4) + 0.4(0.6) = 0.48
P(WW|1) = 0.4(0.4) = 0.16

State   P(state)   P(BB|state)   P(BW|state)   P(WW|state)   Joint BB   Joint BW   Joint WW
1       0.5        0.36          0.48          0.16          0.18       0.24       0.08
2       0.5        0.64          0.32          0.04          0.32       0.16       0.02
Total                                                        0.50       0.40       0.10

Posterior   BB     BW     WW
1           0.36   0.60   0.80
2           0.64   0.40   0.20

[Figure: Two samples (sample twice). Outcome probabilities: BB: 0.5, BW: 0.4, WW: 0.1. Chance arcs (1, 2): BB 0.36/0.64, BW 0.6/0.4, WW 0.8/0.2. Net terminal payoffs are $94 and $–6. Expected values shown: 54 (BW) and 74 (WW).]

Posterior probabilities for 3 samples

P(BBB|1) = 0.6(0.6)(0.6) = 0.216
P(BBW|1) = P(BBW) + P(BWB) + P(WBB) = 3(0.6)(0.6)(0.4) = 0.432
P(BWW|1) = P(BWW) + P(WBW) + P(WWB) = 3(0.6)(0.4)(0.4) = 0.288
P(WWW|1) = 0.4(0.4)(0.4) = 0.064

Likelihoods
State   P(state)   BBB     BBW     BWW     WWW
1       0.5        0.216   0.432   0.288   0.064
2       0.5        0.512   0.384   0.096   0.008

Joint probabilities
State   BBB     BBW     BWW     WWW
1       0.108   0.216   0.144   0.032
2       0.256   0.192   0.048   0.004
Total   0.364   0.408   0.192   0.036

Posterior   BBB    BBW    BWW    WWW
1           0.30   0.53   0.75   0.89
2           0.70   0.47   0.25   0.11
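Since the ball is returned after each draw, the number of Black draws in n samples is binomial given the box, so the fixed-sample tables can be generated for any n. This is a minimal sketch with exact fractions. The name `outcome_table` and the indexing by the count of Black draws are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

P_BLACK = {1: Fraction(6, 10), 2: Fraction(8, 10)}  # P(Black | box)
PRIOR = {1: Fraction(1, 2), 2: Fraction(1, 2)}

def outcome_table(n):
    """Map k (number of Black draws out of n) to (P(k), P(box 1 | k))."""
    table = {}
    for k in range(n + 1):
        joint = {
            box: PRIOR[box] * comb(n, k)
            * P_BLACK[box] ** k * (1 - P_BLACK[box]) ** (n - k)
            for box in PRIOR
        }
        total = sum(joint.values())
        table[k] = (total, joint[1] / total)
    return table

three = outcome_table(3)
for k in sorted(three, reverse=True):  # BBB, BBW, BWW, WWW
    prob, post1 = three[k]
    print(k, float(prob), round(float(post1), 2))
```

Calling `outcome_table(2)` gives the corresponding values for the two-sample case.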

[Figure: Three samples (sample 3 times). Outcome probabilities: BBB: 0.36, BBW: 0.41, BWW: 0.19, WWW: 0.04. Chance arcs (1, 2): BBB 0.3/0.7, BBW 0.53/0.47, BWW 0.75/0.25, WWW 0.89/0.11. Net terminal payoffs are $91 and $–9. Expected values shown: 44 (BBW), 66 (BWW) and 80 (WWW).]

Summary of results with fixed number of samples

[Figure: Results for 0 samples, 1 sample, 2 samples and 3 samples.]
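The comparison across fixed sample sizes can be recomputed from the problem data. The sketch below is calculated from the stated reward ($100), sample cost ($3) and ball counts rather than read from the slide. The names `net_value`, `REWARD` and `COST` are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

P_BLACK = {1: Fraction(6, 10), 2: Fraction(8, 10)}  # P(Black | box)
PRIOR = {1: Fraction(1, 2), 2: Fraction(1, 2)}
REWARD, COST = 100, 3

def net_value(n):
    """Expected net payoff when exactly n samples are committed to in advance."""
    gross = Fraction(0)
    for k in range(n + 1):  # k = number of Black draws
        joint = [
            PRIOR[box] * comb(n, k)
            * P_BLACK[box] ** k * (1 - P_BLACK[box]) ** (n - k)
            for box in (1, 2)
        ]
        # Guess the box with the larger joint probability for this outcome.
        gross += REWARD * max(joint)
    return gross - COST * n

values = {n: net_value(n) for n in range(4)}
print({n: float(v) for n, v in values.items()})
```

Under these assumptions the net values are 50, 57, 58 and 55.8 for 0, 1, 2 and 3 samples, so 2 samples is the best fixed plan and a 3rd sample lowers the expected net payoff.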

Value of Sample (new information)

Results of previous example:
• With sequential samples (slide 23)
• With fixed no. of samples (slide 31)

The 3rd sample is never needed.

Questions:
• How many samples should be taken?
• Is it better to decide immediately or after more information?

Expected Value of Information

Assume P(1) = p, P(2) = 1 – p. Then:

State   P(state)   P(B|state)   P(W|state)   Joint B      Joint W
1       p          0.6          0.4          0.6p         0.4p
2       1 – p      0.8          0.2          0.8(1 – p)   0.2(1 – p)
Total   1.0                                  (4 – p)/5    (1 + p)/5

Posterior   Given Black         Given White
1           3p/(4 – p)          2p/(1 + p)
2           4(1 – p)/(4 – p)    (1 – p)/(1 + p)

Expected payoff
• Best payoff if Black = 100 max{3p/(4 – p), 4(1 – p)/(4 – p)}
• Best payoff if White = 100 max{2p/(1 + p), (1 – p)/(1 + p)}

Expected outcome
F(p) = 100 [(4 – p)/5] max{3p/(4 – p), 4(1 – p)/(4 – p)} + 100 [(1 + p)/5] max{2p/(1 + p), (1 – p)/(1 + p)}
F(p) = 100 [max{0.6p, 0.8(1 – p)} + max{0.4p, 0.2(1 – p)}]
F(p) = max{60p, 80(1 – p)} + max{40p, 20(1 – p)}
F(p) = max{a, b} + max{c, d}, where a = 60p, b = 80(1 – p), c = 40p, d = 20(1 – p)
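The final form of F(p) is easy to evaluate directly. The sketch below compares it with the payoff from guessing without a sample, 100 max{p, 1 – p}. The function names are illustrative assumptions, and the sample cost is not yet deducted.

```python
from fractions import Fraction

def payoff_with_one_sample(p):
    """F(p): expected payoff after one free sample when P(box 1) = p."""
    return max(60 * p, 80 * (1 - p)) + max(40 * p, 20 * (1 - p))

def payoff_without_sample(p):
    """Expected payoff from guessing the more likely box immediately."""
    return 100 * max(p, 1 - p)

for p in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
    print(p, payoff_with_one_sample(p), payoff_without_sample(p))
```

At p = 1/4 and p = 3/4 the two payoffs coincide at 75, while at p = 1/2 the sample raises the expected payoff from 50 to 60.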

[Figure: Graph of expected payoff F(p) against p, with breakpoints marked at p = 1/3 and p = 4/7.]

Maximum Expected Payoff

To maximize F(p) on 0 < p < 1, the graphical solution gives:

• 0 < p < 1/3: F(p) = 100(1 – p) (b + d)
• 1/3 < p < 4/7: F(p) = 80 – 40p (b + c)
• 4/7 < p < 1: F(p) = 100p (a + c)

For the 1st and 3rd ranges, the solution is the same as the expected payoff given only P(1) = p, P(2) = 1 – p.

Only the 2nd range has an improvement in expected payoff, so a sample should be taken only if 1/3 < p < 4/7.

Expected Value of Sample Information

Value of sample information = expected improvement in payoff. It is nonzero only for 1/3 < p < 4/7, where F(p) = 80 – 40p:

• 1/3 < p < 0.5: (80 – 40p) – (100 – 100p) = 60p – 20
• 0.5 < p < 4/7: (80 – 40p) – 100p = 80 – 140p

Outside 1/3 < p < 4/7 the improvement is 0.
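The improvement can be computed for any p as the difference between the payoff with one sample and the payoff without it. A minimal sketch follows, where the function name `evsi` is an illustrative assumption.

```python
from fractions import Fraction

def evsi(p):
    """Expected improvement in payoff from one free sample when P(box 1) = p."""
    with_sample = max(60 * p, 80 * (1 - p)) + max(40 * p, 20 * (1 - p))
    without_sample = 100 * max(p, 1 - p)
    return with_sample - without_sample

for p in (Fraction(1, 4), Fraction(2, 5), Fraction(1, 2), Fraction(11, 20), Fraction(3, 4)):
    print(p, evsi(p))
```

The improvement is largest at p = 0.5, where it equals 10, and falls to 0 at p = 1/3 and p = 4/7.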

Range of p for sample cost = 3

For sample cost = 3, a sample should be taken only if the improvement is > 3:

• 60p – 20 > 3, so p > 0.383
• 80 – 140p > 3, so p < 0.55

Thus, 0.383 < p < 0.55
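The two bounds come from solving the linear inequalities for p. The sketch below does this with exact fractions for a general cost. The names `sampling_range` and `worth_sampling` are illustrative assumptions, and the formulas hold only while the cost is below the maximum improvement of 10.

```python
from fractions import Fraction

def sampling_range(cost):
    """Return (lower, upper) bounds on p for which one sample is worth its cost."""
    lower = Fraction(20 + cost, 60)   # from 60p - 20 > cost
    upper = Fraction(80 - cost, 140)  # from 80 - 140p > cost
    return lower, upper

def worth_sampling(p, cost=3):
    lower, upper = sampling_range(cost)
    return lower < p < upper

lower, upper = sampling_range(3)
print(float(lower), float(upper))
print(worth_sampling(Fraction(3, 7)), worth_sampling(Fraction(2, 3)))
```

With cost 3 the bounds are 23/60 and 11/20. A belief of 3/7 (after one Black draw) lies inside the range, while 2/3 (after one White draw) lies outside it.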

For fixed no. of samples

Posteriors after 2 samples (slide 27)

            BB     BW     WW
P(1) = p    0.36   0.60   0.80

Since all probabilities are outside the range 0.383 < p < 0.55, a 3rd sample should not be taken.

How many samples?

So far, the analysis is for the value of 1 sample. We can estimate the value of several samples.

Max. no. of samples
• Expected payoff with no information = 50
• Payoff with perfect information = 100
• Max. no. of samples = (100 – 50)/3 ≈ 16.7, so at most 16
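The bound is a single integer division: total sampling cost should not exceed the gain from perfect information. A minimal sketch with illustrative variable names:

```python
payoff_perfect_information = 100
payoff_no_information = 50
cost_per_sample = 3

# Largest possible gain from any amount of information.
max_gain = payoff_perfect_information - payoff_no_information

# Largest number of samples whose total cost does not exceed that gain.
max_samples = max_gain // cost_per_sample
print(max_gain, max_samples)
```

This is only an upper bound on the number of samples worth paying for, since real samples never deliver perfect information.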
