Robotic Rational Reasoning! Lecture 2 P.H.S. Torr


Page 1: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Robotic Rational Reasoning! Lecture 2

P.H.S. Torr

Page 2: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Summary of last lecture

• We talked about the problem of induction and how it might relate to AI.

• We saw probability might be the basis of reasoning about the world.

• We saw there were problems with the attempts to make objective interpretations of probability:
– Venn
– Von Mises (frequentists)
– Popper (propensity)

Page 3: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

This lecture

• We will develop a system of probability that can be used within AI.

• We shall show how this can be used to reason about events and decide actions:
– How to represent our belief in propositions by numbers
– How to act on the basis of those beliefs

• Once we have done this we want to show how to act rationally, so as to optimize utility.

Page 4: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Belief

• Why is belief important?

• Consider again the example of the coin hidden in one of my hands.

• Your belief about which hand it is in is different from mine.

• Can we make this more formal so that we can use belief in practice?

Page 5: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Quantifying Belief

• We talked about ‘stronger’ and ‘weaker’ belief; and being ‘more or less confident’. Can we make sense of this talk?

• That is, can we quantify partial belief, and give a numerical basis to these comparative judgements?
– Assign numbers to measure the strength of a belief;
– Prove that those numbers must be, formally, probabilities; and
– Provide a rule for updating beliefs given new evidence.

• These three achievements constitute Bayesian epistemology.

Page 6: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Task 1: Measuring Belief

• We turn first to the task of assigning numbers to measure belief.

• Is it possible to measure belief in some rational way?

Page 7: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Theories of Belief

• Logical – Keynes, Carnap, Cox, Jaynes

• Preference over actions – Ramsey, Savage

• Betting– De Finetti

Page 8: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Logical theory of belief

• Here logic is extended from binary valued to include uncertainty over propositions.

• This theory was first put forward by the economist Keynes and then given rigour by R.T. Cox.

• It is convincing but the mathematics is difficult.

Page 9: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Subjective theory of belief based on Actions

• Due to Ramsey and Savage.

• From a person expressing preferences between actions one can derive the whole of probability theory, a remarkable feat.

• Again highly mathematical.

Page 10: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Gambles and Degrees of Belief

• Probability has a long association with games of chance.

• One way that we have talked about probabilities was in terms of rolls of dice and tosses of coins.

• A better way to represent degrees of belief will be in terms of gambles.

– A gamble is simply a choice between two possibilities or options.

– The idea is to:

» Assign cash payoffs to options, and then

» Assign numbers reflecting which of the options we think is more likely.

Page 11: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example of a Gamble

• Setup:

– First we need to list our options:

» B = The US will bomb North Korea by June 30, 2007.

» ~B = The US will not bomb North Korea by June 30, 2007.

– Second we need to think of some sort of payoff for when one of these options obtains; the payoffs will be understood in terms of gambles, that is, in terms of taking one option as more likely than the other. (Here think of a reward for yourself, say getting $10.)

» Gamble 1: Win $10, if B occurs, otherwise nothing

» Gamble 2: Win $10, if ~B occurs, otherwise nothing

Page 12: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

There are three possibilities here

(a) Indifference — You might treat the two options as equally likely, thus you have no reason to accept Gamble 1 over Gamble 2.

(b) You want to take Gamble 1 — If you choose Gamble 1, then you feel that B is more likely to occur than ~B.

(c) You want to take Gamble 2 — If you choose Gamble 2, then you feel that ~B is more likely to occur than B.

Page 13: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Betting Rates and Payoff Matrices

• The examples thus far did not involve taking a risk, in that you lose nothing if the gamble you chose does not pay out.

• Another way of assigning numbers to confidence levels is to make the decision a risky one, that is, one where the participants stand to lose something by choosing incorrectly.

Page 14: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Betting Rates and Payoff Matrices

• Notation:

– Suppose two people (J and K) make a bet that some event A will turn out.

– J bets amount $10 on A, and

– K bets amount $10 against A. (Betting against A is the same as betting on ¬A.)

– What is the total amount of money at stake between J and K?

» $20.

» The total amount is the sum of the individual bets (i.e., $(10 + 10)).

» The total amount here is called the stake.

Page 15: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Betting Rates and Payoff Matrices

• In some sense the previous bet is fair if the probability of A is 0.5.

• Consider, what would you pay to enter a lottery in which the stake is 10,000 and the chance of winning is 0.1?

Page 16: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Betting Rate

– What we want to know is what a person’s betting rate is. This will be used as a proxy for their personal probability.

» The probability that you assign to some hypothesis will be equivalent to your betting rate on the hypothesis being true.

– The betting rate is calculated by taking your individual bet and dividing it by the stake:

» Your Betting Rate = Your Bet / Stake

» NOTE: We can also determine what a person’s bet is if we know the stake and their betting rate, by the following equation:

» Bet = (Personal Betting Rate) * (Stake)

» All that we have done here is solved the previous equation for the Bet, which involves multiplying both sides by the Stake.

Page 17: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Bets

• A bet involves two bettors X & Y

• X bets £x on H,
• Y bets £y on ¬H (i.e. against H).

• The winner takes £(x+y)

• Thus the odds against H are y:x

• The betting rate of X on H is b(X) = x/(x+y)
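
A minimal sketch (mine, not from the slides) of how these definitions translate into code; the function names are illustrative:

```python
# Betting rate and odds, following the definitions above.

def betting_rate(bet: float, stake: float) -> float:
    """Your betting rate = your bet divided by the total stake."""
    return bet / stake

def bet_from_rate(rate: float, stake: float) -> float:
    """Bet = (personal betting rate) * (stake): the same equation solved for the bet."""
    return rate * stake

# X bets x on H, Y bets y on not-H; the winner takes x + y.
x, y = 10.0, 10.0
stake = x + y                              # the stake: 20
print(betting_rate(x, stake))              # X's betting rate on H: 0.5
print(f"odds against H are {y}:{x}")
```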

Page 18: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Definition: Fair Bet

• A bet on H is fair for X iff X is indifferent between the two options of:
– Betting on H with wager £x; or
– Betting against H with wager £y.

• A fair bet is one where X's betting rates x/(x+y) and y/(x+y) are such that X does not prefer to bet either on H or on ¬H at those rates (or, equivalently, X would not prefer either side at those odds).

• A fair bet is not one we would take: in betting we usually wish to win, and hence to bet at unfair rates, unfair in our favour!

Page 19: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Betting Rates and Degrees of Belief

• If agent X has betting rate b(H) in what X regards as a fair bet on H, then X ’s degree of belief in H is equal to b(H).

• As we define them, X need not actually bet to have a belief: X must simply have preferences which would make him indifferent between the two betting options, whether or not X ever acts on those preferences (or is ever faced by just those options).

• We have completed task one: devised a method for assigning numerical values to propositions.

Page 20: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Payoff Matrix

(1) Assume X bets on A, where the Stake = £S and X’s betting rate = p.

(2) Then X’s bet is £pS, which is what he stands to lose; he stands to gain £S(1-p). This can be shown in a payoff matrix:

If you bet on A, then:
(i) If A occurs, you get S(1-p); or
(ii) If ~A occurs, you get -pS.

If you bet against A, then:
(i) If A occurs, you get -S(1-p); or
(ii) If ~A occurs, you get pS.
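
A minimal sketch (mine) of this payoff matrix, with stake S and betting rate p:

```python
# Payoff for a bet on A (or against A) at rate p with stake S,
# following the matrix above.
def payoff(bet_on_A: bool, A_occurs: bool, S: float, p: float) -> float:
    if bet_on_A:
        return S * (1 - p) if A_occurs else -p * S
    return -S * (1 - p) if A_occurs else p * S

# All four cells of the matrix for S = 20, p = 0.5:
for bet_on_A in (True, False):
    for A_occurs in (True, False):
        print(bet_on_A, A_occurs, payoff(bet_on_A, A_occurs, 20.0, 0.5))
```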

Page 21: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Subjective Belief

• So now we have assigned numbers representing degrees of belief to propositions.

• Note these are so far entirely personal (subjective).

• Are there any rules governing these beliefs?

• Next we shall show that, remarkably, they obey the rules of probability.

Page 22: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Set of Propositions

• First we consider (rather informally) the set of propositions we are dealing with.

• These can be any set of propositions, P, providing they form a Boolean algebra:
– H in P implies ¬H in P
– H1 in P and H2 in P implies (H1 ∨ H2) in P

Page 23: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Recall standard rules of probability
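
For any propositions A and B the standard rules are:
– 0 ≤ Pr(A) ≤ 1
– Pr(A) = 1 if A is certain
– Pr(A v B) = Pr(A) + Pr(B) if A and B are mutually exclusive
– Pr(A & B) = Pr(A/B) Pr(B)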

Page 24: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Probability Function over propositions
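
A probability function over P assigns a number b(H) to each H in P such that:
– 0 ≤ b(H) ≤ 1
– b(H) = 1 if H is a tautology
– b(H1 v H2) = b(H1) + b(H2) whenever H1 and H2 are mutually exclusive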

Page 25: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Ramsey-De Finetti Theorem

• We are now in a position to embark on our second task: proving that rational degrees of belief obey the rules of the probability calculus.

• That is, we must prove:
– Theorem (Ramsey-De Finetti): If X's degrees of belief are rational, then X's degree of belief function defined by fair betting rates is (formally) a probability function.

• We begin by settling what it means for degrees of belief to be rational.

Page 26: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Rationality

• Defining ‘rationality’ in general is a huge task. It is extremely difficult, if not impossible, to give an uncontroversial account; it would be unfortunate if we had to give such an account before we could prove our theorem.

• Thankfully, we don’t have to. We accept the following Commonsense Principle: if we can show that non-probabilistic beliefs would lead to sure betting losses, we can infer that such beliefs are not rational.

Page 27: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Sure-Loss Contracts:

• The idea is that when your betting rates are inconsistent, a crafty person could take your betting rates and construct a betting contract under which you lose no matter what happens. You can be put in a situation where you are sure to lose only if your betting rates are inconsistent, and this (as we will see) happens only when your betting rates violate the rules of probability.

Page 28: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Sure loss contract or Dutch Book

– BETTING CONTRACT DEFINED: A betting contract is a contract between two people to settle a bet at some agreed-upon set of betting rates.

– BOOKMAKER DEFINED: A bookmaker is someone who makes betting contracts. He pays you if you win, and collects from you if you lose.

– SURE-LOSS CONTRACT DEFINED: A sure-loss contract is a betting contract in which you will lose no matter what outcome occurs.

Page 29: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example of Sure Loss Contract:

– Recall from above:

» B = The US will bomb North Korea by June 30, 2007.

» Your betting rate on B is 5/8.

» ~B = The US will not bomb North Korea by June 30, 2007.

» Your betting rate on ~B is 3/4

– The bookmaker chooses which way to run the bets, given the betting rates that you have already selected. The bookmaker is a crafty fellow, so he will take advantage of both of your rates by having you place independent bets on B and on ~B.

Page 30: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

The Bets

– Bets on B (at a rate of 5/8):

» You bet $5 and the bookmaker bets $3.

– Bets on ~B (at a rate of 3/4):

» You bet $6 and the bookmaker bets $2.

Page 31: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Result

– What happens if the US bombs Korea (i.e., B occurs)?
» What do you win? You would win $3. This is what the bookmaker put up against your bet on B.
» What do you lose? You would lose $6. This is what you bet on ~B.
» What is the net payoff if B occurs? YOU WOULD SUFFER A NET LOSS OF $3.

– What happens if the US doesn't bomb Korea (i.e., ~B occurs)?
» What do you win? You would win $2. This is what the bookmaker put up against your bet on ~B.
» What do you lose? You would lose $5. This is what you bet on B.
» What is the net payoff if ~B occurs? YOU AGAIN WOULD SUFFER A NET LOSS OF $3.

Page 32: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Payoff Matrix

The basic idea is that, no matter how things turn out, you lose $3.

This is what a sure-loss contract looks like; since these were your own betting rates, you cannot cry foul that you were misled by the bookmaker.

The notion of a sure-loss contract gives us a way to define coherence among betting rates.
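
A small sketch (mine) checking the net payoff with the numbers above:

```python
# You bet $5 on B (bookmaker puts up $3) and $6 on ~B (bookmaker puts up $2).
your_bets = {"B": 5.0, "notB": 6.0}
bookmaker_bets = {"B": 3.0, "notB": 2.0}

for B_occurs in (True, False):
    winner, loser = ("B", "notB") if B_occurs else ("notB", "B")
    # You collect the bookmaker's side of the winning bet and forfeit your losing bet.
    net = bookmaker_bets[winner] - your_bets[loser]
    print(f"B occurs = {B_occurs}: net payoff {net}")   # -3.0 either way
```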

Page 33: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

The Argument for Coherence from the Basic Rules of Probability

• Outline:

– What we are aiming for is a notion of a set of betting rates being coherent if and only if the set of betting rates satisfies the basic rules of probability.

– What we will do is show that when we develop betting rates that violate the rules of probability the result will be that a bookmaker can construct a sure-loss contract from our betting rates.

• First we show that all betting rates must lie between 0 and 1.

Page 34: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Normal Beliefs

• If b(H) is between 0 and 1 it is said to be normal.

• We can show that if we do not have belief that is normal then we are vulnerable to a sure loss contract.

• It is sufficient to show this for b(H) > 1, as b(H) < 0 implies b(¬H) > 1.

Page 35: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Normal Beliefs

• If b(H) > 1 then our betting rate is expressible as x/(x+y) where x > (x+y), hence y is negative.

• Hence we are indifferent between:
– Betting x on H: either losing x if H is false, or winning (x+y) if H is true; but (x+y) < x, so we lose either way: a sure loss.
– Betting y on ¬H: either losing y (a win, since y is negative), or winning (x+y); either way a sure win.

• Being indifferent between a sure loss and a sure win is irrational, and the bookmaker can always make us lose.

Page 36: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Argument for Rule of Addition:

• When A and B are mutually exclusive events, Pr(A v B) = Pr(A) + Pr(B).

• If our betting rates are to meet this, then our betting rate for (A v B) should equal the sum of our betting rate for A and our betting rate for B.

• Thus, if A and B are mutually exclusive, then the Addition Rule requires:
– Betting rate on (A v B) = betting rate on A + betting rate on B

Page 37: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Strategy

– What we will do is see if a sure-loss contract can arise when we violate this rule.

» Consider three betting rates:

» Betting rate on A = p

» Betting rate on B = q

» Betting rate on A v B = r

» Assume that the restricted rule of addition is violated here.

» Thus: r < p + q. That is, assume that we assign a betting rate r to (A v B) that is less than the sum of the betting rates p and q.

Page 38: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Layout of The Bets:

– Suppose a bookmaker asks you to make bets on your rates simultaneously, where the stake for each bet is $1.

» With a stake of $1, we can use the betting rates as proxies for the bets themselves.

– Bet 1:
» Here we bet p on A.
» What do we win if A occurs? We win the remainder of the stake, which is (1 - p).
» What do we lose if ~A occurs? We lose our original bet of p.

– Bet 2:
» Here we bet q on B.
» What do we win if B occurs? We win the remainder of the stake, which is (1 - q).
» What do we lose if ~B occurs? We lose our original bet of q.

Page 39: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Layout of The Bets:

– Bet 3:

» Here we bet (1 – r) against (A v B).

» Since this is against either A or B, we will only win if neither A nor B occurs.

» What do we win if neither A nor B occurs?

» Here we would win r.

» What do we lose if either A or B occurs?

» Here we would lose our original bet of (1 – r).

Page 40: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Setting up the Payoff Table:

Recall that we are assuming that r is less than the sum of p and q, that is, r < p + q.

If that is so, then r - (p + q) is always negative. Thus, we would lose at these betting rates no matter which scenario took place.

Payoff (iii) is the easiest to see: neither A nor B is true. Here you win r, less p and q. Therefore, the net payoff is r - p - q. But since r is less than p + q, this means that regardless of what r is, you lose [r - (p + q)].
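
A minimal sketch (mine) of the three bets; with p = 0.2, q = 0.3, r = 0.4 it reproduces the numerical example on the next slide:

```python
# Net payoff of the three simultaneous $1-stake bets:
# bet p on A, bet q on B, and bet (1 - r) against (A v B),
# for mutually exclusive A and B.
def net_payoff(p: float, q: float, r: float, A: bool, B: bool) -> float:
    pay_A = (1 - p) if A else -p                   # Bet 1: on A
    pay_B = (1 - q) if B else -q                   # Bet 2: on B
    pay_AvB = r if not (A or B) else -(1 - r)      # Bet 3: against (A v B)
    return pay_A + pay_B + pay_AvB

p, q, r = 0.2, 0.3, 0.4    # r < p + q: the addition rule is violated
for A, B in [(True, False), (False, True), (False, False)]:
    print(A, B, round(net_payoff(p, q, r, A, B), 10))   # always r - (p + q) = -0.1
```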

Page 41: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Simple Numerical example

• Suppose I have b(A) = 0.2, b(B) = 0.3, b(A v B) = 0.4, hence b(¬(A v B)) = 0.6 (A and B mutually exclusive).

• The bookmaker offers a £1 stake on each of the propositions A, B and ¬(A v B):
– If A & ¬B my gain is -0.1 (+0.8 - 0.3 - 0.6)
– If B & ¬A my gain is -0.1 (-0.2 + 0.7 - 0.6)
– If neither A nor B my gain is -0.1 (-0.2 - 0.3 + 0.4)

Page 42: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Simple Numerical example

• Suppose I have b(A) = 0.2, b(B) = 0.3, b(A v B) = 0.6, hence b(¬(A v B)) = 0.4.

• The bookmaker offers a £1 stake on each, with me taking the bets on ¬A, ¬B and (A v B):
– If A my gain is -0.1 (-0.8 + 0.3 + 0.4)
– If B my gain is -0.1 (+0.2 - 0.7 + 0.4)
– If neither A nor B my gain is -0.1 (+0.2 + 0.3 - 0.6)

Page 43: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Belief = Betting Rates

– Thus, we have a sure-loss contract, and this means that by violating the restricted rule of addition we end up with inconsistent betting rates.

– If our betting rates here had followed the restricted rule of addition, that is, if r = p + q, then there would have been no way to create a sure loss contract involving A, B, and A v B.

– LARGER POINT: A necessary and sufficient condition for a set of betting rates, including conditional betting rates, to be coherent is that they satisfy the basic rules of probability.

Page 44: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Conditional Probability

• We have left out conditional probability but this can be easily demonstrated using similar arguments.

Page 45: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Quantifying Belief

• So now we have completed our first two tasks:
– Assign numbers to measure the strength of a belief;
– Prove that those numbers must be, formally, probabilities.

• Next we need to:
– Provide a rule for updating beliefs given new evidence.

Page 46: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Conditional Probability

• We shall next do a little revision on conditional probability as it is necessary to know about this to define Bayes’ rule.

Page 47: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Conditional Probability

• Many times, however, we take the probability of one event to be dependent on another event occurring (or proposition being true).

– Example of Conditional Probability Statement:

– P2: Given that she starts the chemotherapy immediately, Sarah has a 70% chance of stopping the spread of the cancer.

• Conditional probabilities are complex statements—they are composed of two constituent statements.

– In each case here, there are two sentences (or events) that are taken to be related to one another:

– The probability of one of the statements is dependent on the other statement being true.

Page 48: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Conditional Probability

– NOTATION FOR CONDITIONAL PROBABILITY: Pr(•/•); e.g., Pr(A/B)

– We can read conditional probabilities in the following ways:• The probability of A given B.

• The probability of A on B.

• The probability of A on the condition that B.

• Definition of Conditional Probability

Page 49: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Conditional Probability

• Definition of Conditional Probability:

Pr(A/B) = Pr(A & B) / Pr(B), provided Pr(B) > 0.

a) Read: (1) When the probability of B is greater than 0, the probability of A given B is equal to the probability of A and B jointly occurring, divided by the probability of B occurring.

(2) Why must Pr(B) be greater than 0? (a) Because we cannot divide by 0.

Page 50: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Quantifying Belief

• So now we have completed our first two tasks:
– Assign numbers to measure the strength of a belief;
– Prove that those numbers must be, formally, probabilities.

• Next we need to:
– Provide a rule for updating beliefs given new evidence.

Page 51: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Relation of Evidence to a Hypothesis

– What we will be interested in today is the relationship between a hypothesis and a piece of evidence which either supports or undermines our belief in that hypothesis.

– That is, we will be interested in using the tools of probability as a model for seeing how we should think of our confidence in this hypothesis after being given this piece of evidence.

Page 52: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Relation of Evidence to a Hypothesis

– What we are going to find is that the probability that a hypothesis is true given some evidence (i.e., Pr(H / E) ) is related in an important way to the probability that the evidence is true given the hypothesis (i.e., Pr(E / H) ).

– This is the cornerstone of Bayes’ Theorem and what we will come to know as Bayesian Reasoning.

Page 53: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Bayes Theorem

If there are two hypotheses, H and ¬H, conditional probability can be manipulated to yield Bayes' rule:

Pr(H/E) = Pr(E/H) Pr(H) / [Pr(E/H) Pr(H) + Pr(E/¬H) Pr(¬H)]
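
A minimal sketch (mine) of this two-hypothesis form:

```python
# Pr(H|E) = Pr(E|H) Pr(H) / (Pr(E|H) Pr(H) + Pr(E|~H) Pr(~H))
def posterior(prior_H: float, lik_E_given_H: float, lik_E_given_notH: float) -> float:
    numerator = lik_E_given_H * prior_H
    evidence = numerator + lik_E_given_notH * (1 - prior_H)  # rule of total probability
    return numerator / evidence

print(posterior(0.5, 0.1, 0.5))   # 1/6, as in the airport example later
```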

Page 54: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Components

• Prior (Antecedent) Probabilities:

– There are usually just two, but there can be an infinite number so long as the set is mutually exclusive and jointly exhaustive: Pr(H) and Pr(~H)

– The prior probabilities are simply how likely we think that a hypothesis is true prior to taking into account the new evidence.

Page 55: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Components

• Likelihoods:

– There is one likelihood for each prior probability, so in the simple case there will be just two likelihoods:
• Pr(E / H) and Pr(E / ~H)

– Likelihoods are conditional probabilities; they are the probability that a given piece of evidence would obtain, given that a certain hypothesis is true. Thus:
• There is the likelihood of the evidence (E) being true given that the hypothesis (H) is true, and the likelihood of the evidence (E) being true given that the negation of the hypothesis (~H) is true.

Page 56: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Components

• Posterior Probability — Pr(H / E)

– There will always be only one of these, regardless of how many prior probabilities we are considering. This is what we are looking to solve for in using Bayes' Theorem.

– We call Pr(H / E) a posterior probability because it is the new probability that we assign to our hypothesis H after considering the new piece of evidence E.

Page 57: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Bayes Theorem

If there are two hypotheses, H and ¬H, conditional probability can be manipulated to yield Bayes' rule:

Pr(H/E) = Pr(E/H) Pr(H) / [Pr(E/H) Pr(H) + Pr(E/¬H) Pr(¬H)]

Page 58: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Notes

– The numerator is the prior (antecedent) probability multiplied by the likelihood of the evidence given the hypothesis.
• This is simply the probability that H & E will both occur.

– The denominator is simply the Rule of Total Probability.
• Thus, this is simply the probability that the evidence would occur regardless of which hypothesis turns out to be true.

– IDIOMATIC EXPLANATION:
• We can think of Bayes' Theorem as relating the probability of a hypothesis being true, given that some new piece of evidence obtains, to the probability of the hypothesis being true and the evidence obtaining, divided by the probability of the evidence obtaining regardless of whether our hypothesis is true or not.

Page 59: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Bayesian Induction

• How are we going to show that Bayes' theorem allows us to revise our hypotheses in light of new evidence?
– We are going to show this by re-running the theorem for a hypothesis when we get new pieces of information, and then recording the differences between our prior probabilities and our posterior probabilities.

– When we re-run the theorem, we take the posterior probability of our previous calculation and treat it as our prior probability in the current calculation.
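
A minimal sketch (mine) of this re-running of the theorem:

```python
def posterior(prior: float, lik_H: float, lik_notH: float) -> float:
    num = lik_H * prior
    return num / (num + lik_notH * (1 - prior))

# Each posterior becomes the prior for the next piece of evidence.
def update_sequentially(prior: float, evidence) -> float:
    """evidence: a list of (Pr(E|H), Pr(E|~H)) pairs, one per observation."""
    for lik_H, lik_notH in evidence:
        prior = posterior(prior, lik_H, lik_notH)
    return prior

# Two pieces of corroborating evidence push confidence up: 0.5 -> 0.82 -> 0.95
print(update_sequentially(0.5, [(0.9, 0.2), (0.9, 0.2)]))
```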

Page 60: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Bayesian Confirmation

• Bayes Theorem gives us a device for belief revision under new evidence.

• Bayes' Theorem shows two things:
– First, as new evidence comes in to corroborate a hypothesis, our confidence in that hypothesis should go up (a process we can call confirmation), and

– Second, as new information comes in that undermines a hypothesis, our confidence level in the hypothesis should go down (a process we can call disconfirmation).

Page 61: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example

• There is a robotic agent monitoring access to an airport. Every day thousands of people come to the airport; however, there is also a danger of terrorist attack. So the agent uses a computer vision face recognition system to determine whether the people passing into the airport look like someone on a database of suspected terrorists. The agent must decide whether to indicate that someone is a terrorist or not, i.e. there are two possible actions for the agent (stop, or let through) and two possible states of a given person (terrorist, or not terrorist).

Page 62: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example

• Let H = the proposition that a given person is a terrorist

• Let E = the evidence from the video camera

Page 63: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example

• Based on video evidence the agent infers the likelihoods
– p(E|H) = p(video|terrorist) = 0.1 and
– p(E|¬H) = p(video|not terrorist) = 0.5
– (note these two need not sum to one as they are distinct distributions).

• The priors are
– p(H) = p(terrorist) = 0.0001 and
– p(¬H) = p(not terrorist) = 0.9999

Page 64: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Work Out Solution
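
Plugging these numbers into Bayes' rule:

p(H|E) = (0.1 × 0.0001) / (0.1 × 0.0001 + 0.5 × 0.9999) = 0.00001 / 0.49996 ≈ 0.00002

So even given the video evidence, the probability that the person is a terrorist is only about 2 in 100,000.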

Page 65: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Where should priors come from?

• There is a large debate on this.

• Objectivists state that one should be able to logically infer the best prior to use so that any two people should agree.

• Subjectivists state that the prior is up to the individual as long as he is coherent.

Page 66: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

The Marquis de Laplace’s rule of Indifference.

• The principle of indifference is a rule for assigning epistemic probabilities. Suppose that there are n > 1 mutually exclusive and exhaustive possibilities. The principle of indifference states that if the n possibilities are indistinguishable except for their names, in the absence of all other knowledge, then each possibility is assigned a probability equal to 1/n. That is, if the names (what we call each possibility) are interchangeable, the possibilities have equal probability.

• The principle of indifference is meaningless under the frequency interpretation of probability, in which probabilities are relative frequencies rather than degrees of belief in uncertain propositions, conditional upon a state of information.

Page 67: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

How should the robot Act?

• So far we have been concerned with probabilities as a measure of the degree of confidence that we should have in our beliefs.

• Decisions, however, require more than just probabilities.

• Next we show how to relate belief to action.

Page 68: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

EXPECTED VALUE (UTILITY)

• A basic model of decision requires two things, if a person decides to do X:
– Component 1: A desire that some state of affairs SA come about, and
– Component 2: A belief that action X will bring about SA.

• Using the tools of probability we have worked with so far, we are only able to make sense of Component 2: we can only use probability theory to measure the probability that X will bring about SA.

Page 69: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

EXPECTED VALUE (UTILITY)

• IMPORTANT COMBINATION: The idea is that in coming to make a decision we want to be able to assess the relative merits of the action in question (what we will call the utility of the state of affairs that the action will bring about), and this is not just the likelihood of it obtaining.

• Utility equates to desire.

Page 70: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Notation

• UTILITY DEFINED: The utility of some consequence C is the degree to which we desire that C obtain.

• Notation:

– Terms:

» Capital letter A denotes an act: A

» Capital letter C followed by a subscript denotes consequences: C1, C2, C3, etc.

– Operators:

» The utility of a possible consequence C1, written as: U(C1)

» The probability of C1 happening, if act A is done, written as: Pr(C1 / A)

» The expected value of action A, written as: Exp(A)

Page 71: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example

– Suppose there is a single action A with two consequences C1 and C2.

» Since there are two consequences, there are two utilities and two probabilities:

» The utilities are: U(C1), U(C2)

» The probabilities are: Pr(C1 / A), Pr(C2 / A)

– Given this we can define the expected value of A as:

» Exp(A) = Pr(C1 / A) U(C1) + Pr(C2 / A) U(C2)

» The idea is that we treat the utility of some consequence as something that we can multiply against the probability of an act bringing about that very consequence.
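
A minimal sketch (mine) of this definition:

```python
# Exp(A) = Pr(C1/A) U(C1) + Pr(C2/A) U(C2) + ...
def expected_value(probs, utilities):
    assert abs(sum(probs) - 1.0) < 1e-9, "consequence probabilities must sum to 1"
    return sum(p * u for p, u in zip(probs, utilities))

# An act with a 0.25 chance of a consequence worth 100 and a 0.75 chance of one worth -20:
print(expected_value([0.25, 0.75], [100.0, -20.0]))   # 0.25*100 + 0.75*(-20) = 10.0
```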

Page 72: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Expected Value

• EXPECTED VALUE DEFINED: The expected value of an action is the sum of the individual products of the utilities and their matching probabilities.

• Definition of the Expected Utility Hypothesis: the utility of an agent facing uncertainty is calculated by considering utility in each possible state and constructing a weighted average, where the weights are the agent's estimate of the probability of each state. Arrow (1963) attributes the earliest known written statement of this hypothesis to Daniel Bernoulli (1738).

Page 73: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

ST. PETERSBURG PARADOX:

• Here we imagine a chance setup involving a fair coin. If, on the first toss, it comes up heads, then you win $2 and the game stops; if tails, the coin is tossed again. If, on the second toss, it comes up heads, then you win $4 and the game stops; if tails, the coin is tossed again. If, on the third toss, it comes up heads, then you win $8 and the game stops; if tails, the coin is tossed again. And so on, for each succeeding toss. The idea is that the game only stops when a head is thrown, and for whatever nth toss you are on, your winnings will be $2^n if the coin lands on heads. Suppose now that you are given a chance to play the game, but an entry fee is being charged.
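
A minimal simulation sketch (mine) of the game:

```python
import random

# One play: toss a fair coin until heads; heads on toss n pays $2^n.
def play() -> int:
    n = 1
    while random.random() < 0.5:   # tails with probability 1/2: toss again
        n += 1
    return 2 ** n

# The sample mean keeps creeping upward as rare long runs of tails occur,
# reflecting the fact that the expected value is infinite.
trials = 100_000
print(sum(play() for _ in range(trials)) / trials)
```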

Page 74: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Question

– The Question is: What is the fair price to play such a game?

• A price is considered fair if it is equal to or reasonably close to the expected value of a play of the game

Page 75: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Problem

• What is the ‘expected value’ of this game?

– Well, it is simply the sum of the expected payoffs of all the potential stages of the game.

– What is the expected payoff at each stage of the game? $1 (a 1/2^n chance of winning $2^n).

– How many (potential) stages are there to the game? There are an infinite number.

• Thus, the expected value is an infinite number of dollars.

Page 76: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Why is this a problem?

• A rational gambler would be willing to enter the game just in case the price of entry was less than the expected value.
– In this game, any finite price of entry is smaller than the expected value of the game. Thus, the rational gambler should play no matter how large the finite price was. There simply is no upper bound on what is a fair price to play the game.

– Of course, some prices are too high for a rational agent to pay to play.

– This is why it is considered a paradox—because it seems to be a ridiculous conclusion, but one that is not the result of any error or miscalculation.

Page 77: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Answer: Diminishing Marginal Utility

• What do we mean by ‘marginal utility’?

• The marginal utility of some quantity X, given a previous quantity Y, is the utility that X would add to the person’s fortune if they already have Y.

• NOTE: The idea is that if we are comparing winning Y (which we might say is at Stage 1 in the game) to winning Y+X (which we might say is at Stage 2 in the game), the additional value of X is not X but the marginal utility of winning X, given Y.

Page 78: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Diminishing Marginal Utility

» What do we mean by ‘diminishing marginal utility’?

» The idea is that even though you win twice as much money at every succeeding stage (that is, the winnings double from the nth stage to the (n+1)th stage), the value of the winnings to you diminishes in proportion to the size of the amount of money you already have.
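
For instance, with Bernoulli's logarithmic utility, where winning $2^n is worth n units of utility, the expected utility of the St. Petersburg game is (1/2)·1 + (1/4)·2 + (1/8)·3 + ... = 2, which is finite; diminishing marginal utility dissolves the paradox.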

Page 79: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Diminishing Marginal Utility

» Think about it this way: you have $10,000 to give away. To whom would the money have more utility, a homeless person or Bill Gates?

» It would seem that $10,000 would matter relatively little to Bill Gates (given that his net worth is roughly $46 billion, i.e., $46,000,000,000), whereas it would matter a great deal to a person living on the street.

Page 80: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Next Some Examples of Decision Theory In Action

• Consider the terrorist example again.

• The agent must decide whether to indicate that someone is a terrorist or not, i.e. there are two possible actions for the agent (stop, or let through) and two possible states of a given person (terrorist, or not terrorist). Each combination of state and action has a cost/benefit as follows:
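
The cost/benefit figures used in the calculations that follow are:

                 terrorist    not terrorist
  stop           +10000       -10
  let through    -10000       +10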

Page 81: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example 1

• Based on video evidence the agent infers the likelihoods
– p(video|terrorist) = p(E|H) = 0.1 and
– p(video|not terrorist) = p(E|¬H) = 0.5.

• Suppose we have the priors
– p(terrorist) = p(not terrorist) = p(H) = p(¬H) = 1/2
– perhaps not a very realistic assumption

Page 82: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Using Bayes Theorem

So our posterior beliefs are

Belief in terrorist given video:
P(H|E) = (0.1 * 0.5) / (0.1 * 0.5 + 0.5 * 0.5) = 1/6

Belief in not terrorist given video:
P(¬H|E) = (0.5 * 0.5) / (0.1 * 0.5 + 0.5 * 0.5) = 5/6

Page 83: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example 1

• From the video evidence the agent has inferred the posterior probabilities
– p(terrorist|video) = 1/6 and
– p(not terrorist|video) = 5/6.
– What action should the vision system advocate, and why?

– Expected utility for stopping is then 1/6 * 10000 - 5/6 * 10
– Expected utility for not stopping is 1/6 * -10000 + 5/6 * 10
– Therefore we should stop them
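
A minimal sketch (mine) of the whole decision, combining the Bayes and expected-utility steps above:

```python
def posterior(prior: float, lik_H: float, lik_notH: float) -> float:
    num = lik_H * prior
    return num / (num + lik_notH * (1 - prior))

p_terrorist = posterior(0.5, 0.1, 0.5)                    # = 1/6

# Utilities: stop/terrorist +10000, stop/not -10, let/terrorist -10000, let/not +10
eu_stop = p_terrorist * 10000 + (1 - p_terrorist) * -10   # ~ +1658.3
eu_let  = p_terrorist * -10000 + (1 - p_terrorist) * 10   # ~ -1658.3
print("stop" if eu_stop > eu_let else "let through")      # stop
```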

Page 84: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Example 2

• For a given person suppose we have the following two likelihoods– p(video|terrorist) = a

– p(video|not terrorist) = b

• What likelihood ratio a/b would cause the agent some indecision?

– With the uniform priors above the posteriors are proportional to a and b, so the expected utility for stopping is proportional to a * 10000 - b * 10

– and the expected utility for not stopping is proportional to a * -10000 + b * 10

• So when both actions have equal expected utility: a * 10000 - b * 10 = a * -10000 + b * 10, which implies 1000a = b, i.e. a likelihood ratio of a/b = 1/1000. Normalizing so that a + b = 1 gives:
– a = 1/1001
– b = 1000/1001

Page 85: Robotic Rational Reasoning! Lecture 2 P.H.S. Torr

Books

• Ian Hacking, Introductory Probability