CS 343: Artificial Intelligence Uncertainty and Utilities Prof. Scott Niekum The University of Texas at Austin [These slides are based on those of Dan Klein and Pieter Abbeel for CS188 Intro to AI at UC Berkeley. All CS188 materials are available at http://ai.berkeley.edu.]

Uncertainty and Utilitiessniekum/classes/343-S20/... · 2019. 8. 29. · Makes one think of inherently random events, like rolling dice Subjectivist / Bayesian answer: Degrees of



Uncertain Outcomes

Worst-Case vs. Average Case

[Figure: game tree with leaf values 10, 10, 9, 100; node labels: max, min, chance]

Idea: Uncertain outcomes controlled by chance, not an adversary!

Expectimax Search

▪ Why wouldn’t we know what the result of an action will be?
  ▪ Explicit randomness: rolling dice
  ▪ Unpredictable opponents: the ghosts respond randomly
  ▪ Actions can fail: when moving a robot, wheels might slip

▪ Values should now reflect average-case (expectimax) outcomes, not worst-case (minimax) outcomes

▪ Expectimax search: compute the average score under optimal play
  ▪ Max nodes as in minimax search
  ▪ Chance nodes are like min nodes but the outcome is uncertain
  ▪ Calculate their expected utilities
  ▪ I.e. take weighted average (expectation) of children

▪ Later, we’ll learn how to formalize the underlying uncertain-result problems as Markov Decision Processes

[Figure: expectimax tree with a max root over chance nodes; values shown: 10, 4, 5, 7 and 10, 10, 9, 100]

Minimax vs Expectimax (Min)

End your misery!

Minimax vs Expectimax (Exp)

Hold on to hope, Pacman!

Expectimax Pseudocode

def value(state):
    if the state is a terminal state: return the state’s utility
    if the next agent is MAX: return max-value(state)
    if the next agent is EXP: return exp-value(state)

def exp-value(state):
    initialize v = 0
    for each successor of state:
        p = probability(successor)
        v += p * value(successor)
    return v

def max-value(state):
    initialize v = -∞
    for each successor of state:
        v = max(v, value(successor))
    return v
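The pseudocode can be made runnable as a minimal Python sketch. The node classes `Leaf`, `MaxNode`, and `ChanceNode` are hypothetical names introduced here for illustration, and the example tree is an assumed shape built from the leaf values 10, 10, 9, 100 with uniform chance nodes.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Leaf:
    utility: float          # terminal state: its utility is known


@dataclass
class MaxNode:
    children: List["Node"]  # the agent picks the best child


@dataclass
class ChanceNode:
    children: List["Node"]
    probs: List[float]      # probs[i] is the probability of children[i]


Node = Union[Leaf, MaxNode, ChanceNode]


def value(node: Node) -> float:
    """Expectimax value: max at MAX nodes, expectation at chance nodes."""
    if isinstance(node, Leaf):
        return node.utility
    if isinstance(node, MaxNode):
        return max(value(child) for child in node.children)
    # Chance node: probability-weighted average of the children.
    return sum(p * value(child) for p, child in zip(node.probs, node.children))


# Assumed tree shape: a MAX root over two uniform chance nodes.
root = MaxNode([
    ChanceNode([Leaf(10), Leaf(10)], [0.5, 0.5]),
    ChanceNode([Leaf(9), Leaf(100)], [0.5, 0.5]),
])
print(value(root))  # 54.5
```

On this assumed tree a minimax agent would score the two branches 10 and 9 and go left, while expectimax scores them 10 and 54.5 and goes right.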

Expectimax Pseudocode

def exp-value(state):
    initialize v = 0
    for each successor of state:
        p = probability(successor)
        v += p * value(successor)
    return v

[Figure: chance node whose children have values 8, 24, -12 and probabilities 1/2, 1/3, 1/6; the values 5 and 7 also appear in the figure]

v = (1/2)(8) + (1/3)(24) + (1/6)(-12) = 10

Expectimax Example

[Figure: expectimax tree with leaf values 3, 12, 9, 2, 4, 6, 15, 6, 0; chance node values 8, 4, 7; root value 8]

Expectimax Pruning?

[Figure: partially explored expectimax tree with leaf values 3, 12, 9, 2]

Depth-Limited Expectimax

[Figure: depth-limited tree with node values 400 and 300 above deeper values 492, 362, …]

Estimate of true expectimax value (which would require a lot of work to compute)

Probabilities

Reminder: Probabilities

▪ A random variable represents an event whose outcome is unknown
▪ A probability distribution is an assignment of weights to outcomes

▪ Example: Traffic on freeway
  ▪ Random variable: T = whether there’s traffic
  ▪ Outcomes: T in {none, light, heavy}
  ▪ Distribution: P(T=none) = 0.25, P(T=light) = 0.50, P(T=heavy) = 0.25

▪ Some laws of probability (more later):
  ▪ Probabilities are always non-negative
  ▪ Probabilities over all possible outcomes sum to one

▪ As we get more evidence, probabilities may change:
  ▪ P(T=heavy) = 0.25, P(T=heavy | Hour=8am) = 0.60
  ▪ We’ll talk about methods for reasoning and updating probabilities later

Reminder: Expectations

▪ The expected value of a function of a random variable is the average, weighted by the probability distribution over outcomes

▪ Example: How long to get to the airport?

  Time:         20 min   30 min   60 min
  Probability:  0.25     0.50     0.25

  Expected time: 0.25 × 20 min + 0.50 × 30 min + 0.25 × 60 min = 35 min
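The airport calculation can be checked with a short Python sketch. The helper name `expectation` is an assumption made for illustration; it also enforces the two laws of probability (non-negative weights that sum to one) before averaging.

```python
def expectation(dist, f=lambda x: x):
    """Expected value of f(X), where dist maps each outcome to its probability."""
    probs = list(dist.values())
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to one")
    return sum(p * f(x) for x, p in dist.items())


# Travel time in minutes mapped to its probability.
travel_time = {20: 0.25, 30: 0.50, 60: 0.25}
print(expectation(travel_time))  # 35.0
```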

What Probabilities to Use?

▪ In expectimax search, we have a probabilistic model of how the opponent (or environment) will behave in any state
  ▪ Model could be a simple uniform distribution (roll a die)
  ▪ Model could be sophisticated and require a great deal of computation
  ▪ We have a chance node for any outcome out of our control: opponent or environment
  ▪ The model might say that adversarial actions are likely!

▪ For now, assume each chance node magically comes along with probabilities that specify the distribution over its outcomes

Having a probabilistic belief about another agent’s action does not mean that the agent is flipping any coins!

What are Probabilities?

▪ Objectivist / frequentist answer:
  ▪ Averages over repeated experiments
  ▪ E.g. empirically estimating P(rain) from historical observation
  ▪ Assertion about how future experiments will go (in the limit)
  ▪ New evidence changes the reference class
  ▪ Makes one think of inherently random events, like rolling dice

▪ Subjectivist / Bayesian answer:
  ▪ Degrees of belief about unobserved variables
  ▪ E.g. an agent’s belief that it’s raining, given the temperature
  ▪ E.g. Pacman’s belief that the ghost will turn left, given the state
  ▪ Often learn probabilities from past experiences (more later)
  ▪ New evidence updates beliefs (more later)

Quiz: Informed Probabilities

▪ Let’s say you know that your opponent is actually running a depth 2 minimax, using the result 80% of the time, and moving randomly otherwise

▪ Question: What tree search should you use?

[Figure: chance node with outcome probabilities 0.1 and 0.9]

▪ Answer: Expectimax!
  ▪ To figure out EACH chance node’s probabilities, you have to run a simulation of your opponent
  ▪ This kind of thing gets very slow very quickly
  ▪ Even worse if you have to simulate your opponent simulating you…
  ▪ …except for minimax, which has the nice property that it all collapses into one game tree
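One way to see where the 0.1 and 0.9 could come from is a small sketch of the opponent model. It assumes the opponent has two legal moves and that "moving randomly" means choosing uniformly among all legal moves, including the minimax move. The function name and the move labels are hypothetical.

```python
def opponent_action_probs(legal_actions, minimax_action, p_minimax=0.8):
    """Mix a uniform random policy with a deterministic minimax choice."""
    n = len(legal_actions)
    # The random part spreads the remaining probability mass evenly.
    probs = {a: (1.0 - p_minimax) / n for a in legal_actions}
    # The minimax move additionally receives the deliberate share.
    probs[minimax_action] += p_minimax
    return probs


probs = opponent_action_probs(["left", "right"], minimax_action="right")
print(probs)  # roughly {'left': 0.1, 'right': 0.9}
```

Finding `minimax_action` is the expensive step: it requires running the opponent's depth 2 minimax at every chance node.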

Modeling Assumptions

The Dangers of Optimism and Pessimism

Dangerous Optimism: Assuming chance when the world is adversarial

Dangerous Pessimism: Assuming the worst case when it’s not likely

Assumptions vs. Reality

                      Adversarial Ghost            Random Ghost
Minimax Pacman        Won 5/5, Avg. Score: 483     Won 5/5, Avg. Score: 493
Expectimax Pacman     Won 1/5, Avg. Score: -303    Won 5/5, Avg. Score: 503

Results from playing 5 games

Pacman used depth 4 search with an eval function that avoids trouble
Ghost used depth 2 search with an eval function that seeks Pacman

Video of Demo World Assumptions: Random Ghost, Expectimax Pacman

Video of Demo World Assumptions: Random Ghost, Minimax Pacman

Video of Demo World Assumptions: Adversarial Ghost, Minimax Pacman

Video of Demo World Assumptions: Adversarial Ghost, Expectimax Pacman

Other Game Types

Mixed Layer Types

▪ E.g. Backgammon

▪ Expectiminimax
  ▪ Environment is an extra “random agent” player that moves after each min/max agent
  ▪ Each node computes the appropriate combination of its children
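A minimal sketch of expectiminimax, assuming a hypothetical tuple encoding: a leaf is a number, a max or min node is `("max", children)` or `("min", children)`, and a chance node is `("chance", [(probability, child), ...])`. The example tree and its values are made up for illustration.

```python
def expectiminimax(node):
    """Each node combines its children in the way its type dictates."""
    if isinstance(node, (int, float)):
        return node  # leaf utility
    kind, children = node
    if kind == "max":
        return max(expectiminimax(c) for c in children)
    if kind == "min":
        return min(expectiminimax(c) for c in children)
    if kind == "chance":
        return sum(p * expectiminimax(c) for p, c in children)
    raise ValueError(f"unknown node type: {kind}")


# Max moves, the environment "rolls", then min moves.
tree = ("max", [
    ("chance", [(0.5, ("min", [3, 5])), (0.5, ("min", [1, 9]))]),
    ("chance", [(0.5, ("min", [4, 4])), (0.5, ("min", [2, 6]))]),
])
print(expectiminimax(tree))  # 3.0
```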

Multi-Agent Utilities

▪ What if the game is not zero-sum, or has multiple players?

▪ Generalization of minimax:
  ▪ Terminals have utility tuples
  ▪ Node values are also utility tuples
  ▪ Each player maximizes its own component
  ▪ Can give rise to cooperation and competition dynamically…

[Figure: three-player game tree with terminal utility tuples (1,6,6), (7,1,2), (6,1,2), (7,2,1), (5,1,7), (1,5,2), (7,7,1), (5,2,5)]
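The generalization can be sketched in a few lines of Python. The tree shape below is an assumption: the eight terminal tuples are grouped pairwise into a binary tree in which players 0, 1, and 2 move in turn. Leaves are tuples and internal nodes are lists, a hypothetical encoding chosen for brevity.

```python
def multi_agent_value(node, player, num_players):
    """Return the utility tuple reached when each player maximizes its own component."""
    if isinstance(node, tuple):
        return node  # terminal: a utility tuple, one entry per player
    next_player = (player + 1) % num_players
    child_values = [multi_agent_value(c, next_player, num_players) for c in node]
    # The player to move picks the child whose tuple is best for itself.
    return max(child_values, key=lambda utilities: utilities[player])


tree = [
    [[(1, 6, 6), (7, 1, 2)], [(6, 1, 2), (7, 2, 1)]],
    [[(5, 1, 7), (1, 5, 2)], [(7, 7, 1), (5, 2, 5)]],
]
result = multi_agent_value(tree, player=0, num_players=3)
print(result)  # (5, 2, 5)
```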

Utilities

Maximum Expected Utility

▪ Why should we average utilities? Why not minimax?

▪ Principle of maximum expected utility:
  ▪ A rational agent should choose the action that maximizes its expected utility, given its knowledge

▪ Questions:
  ▪ Where do utilities come from?
  ▪ How do we know such utilities even exist that represent our preferences?
  ▪ How do we know that averaging even makes sense?
  ▪ What if our behavior (preferences) can’t be described by utilities?

What Utilities to Use?

▪ For worst-case minimax reasoning, terminal function scale doesn’t matter
  ▪ We just want better states to have higher evaluations (get the ordering right)
  ▪ We call this insensitivity to monotonic transformations

▪ For average-case expectimax reasoning, we need magnitudes to be meaningful

[Figure: leaf values 0, 40, 20, 30 become 0, 1600, 400, 900 under the transformation x²]
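A short sketch makes the contrast concrete. It assumes the four leaf values sit under two branches, (0, 40) and (20, 30), and that chance is uniform; both the grouping and the function names are illustrative assumptions.

```python
def minimax_choice(branches):
    """Index of the branch with the best worst-case leaf."""
    return max(range(len(branches)), key=lambda i: min(branches[i]))


def expectimax_choice(branches):
    """Index of the branch with the best average leaf (uniform chance)."""
    return max(range(len(branches)), key=lambda i: sum(branches[i]) / len(branches[i]))


branches = [[0, 40], [20, 30]]
squared = [[x * x for x in branch] for branch in branches]  # monotonic on these values

print(minimax_choice(branches), minimax_choice(squared))        # 1 1
print(expectimax_choice(branches), expectimax_choice(squared))  # 1 0
```

Squaring preserves the ordering of the leaves, so the minimax decision is unchanged, but it changes the averages (20 vs. 25 becomes 800 vs. 650), so the expectimax decision flips.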

Utilities

▪ Utilities are functions from outcomes (states of the world) to real numbers that describe an agent’s preferences

▪ Where do utilities come from?
  ▪ In a game, may be simple (+1/-1)
  ▪ Utilities summarize the agent’s goals
  ▪ Theorem: any “rational” preferences can be summarized as a utility function

▪ We hard-wire utilities and let behaviors emerge
  ▪ Why don’t we let agents pick utilities?
  ▪ Why don’t we prescribe behaviors?

Utilities: Uncertain Outcomes

[Figure: decision tree for getting ice cream; actions: Get Single, Get Double; outcomes labeled Oops and Whew!]

Preferences

▪ An agent must have preferences among:
  ▪ Prizes: A, B, etc.
  ▪ Lotteries: situations with uncertain prizes

▪ Notation:
  ▪ Preference: A ≻ B
  ▪ Indifference: A ~ B

[Figure: a prize A; a lottery yielding A with probability p and B with probability 1-p]

Rationality

Rational Preferences

▪ We want some constraints on preferences before we call them rational, such as:

  Axiom of Transitivity: (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)

▪ For example: an agent with intransitive preferences can be induced to give away all of its money
  ▪ If B ≻ C, then an agent with C would pay (say) 1 cent to get B
  ▪ If A ≻ B, then an agent with B would pay (say) 1 cent to get A
  ▪ If C ≻ A, then an agent with A would pay (say) 1 cent to get C
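The money-pump argument can be simulated directly. This is a sketch under stated assumptions: the agent starts with prize C and 100 cents, pays exactly 1 cent for any swap to a prize it strictly prefers, and is offered B, A, C in rotation. The function name and the numbers are illustrative.

```python
from itertools import cycle, islice


def money_pump(prefers, start_item, cents, offers):
    """Offer swaps one at a time; the agent pays 1 cent whenever it prefers the offer."""
    item = start_item
    for offered in offers:
        if cents > 0 and (offered, item) in prefers:
            item = offered
            cents -= 1
    return item, cents


# (x, y) means x is strictly preferred to y. B > C, A > B, and C > A form a cycle.
intransitive = {("B", "C"), ("A", "B"), ("C", "A")}
offers = islice(cycle(["B", "A", "C"]), 300)
item, cents = money_pump(intransitive, start_item="C", cents=100, offers=offers)
print(item, cents)  # B 0
```

With transitive preferences (drop the pair `("C", "A")`) the agent stops trading once it holds A and keeps the rest of its money.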

Rational Preferences

The Axioms of Rationality

Theorem: Rational preferences imply behavior describable as maximization of expected utility

MEU Principle

▪ Theorem [Ramsey, 1931; von Neumann & Morgenstern, 1944]
  ▪ Given any preferences satisfying these constraints, there exists a real-valued function U such that:

      U(A) ≥ U(B) ⇔ A ≽ B
      U([p1, S1; … ; pn, Sn]) = Σi pi U(Si)

  ▪ I.e. values assigned by U preserve preferences of both prizes and lotteries!

▪ Maximum expected utility (MEU) principle:
  ▪ Choose the action that maximizes expected utility
  ▪ Note: an agent can be entirely rational (consistent with MEU) without ever representing or manipulating utilities and probabilities
  ▪ E.g., a lookup table for perfect tic-tac-toe, a reflex vacuum cleaner

Human Utilities

Utility Scales

▪ Normalized utilities: u+ = 1.0, u- = 0.0

▪ Micromorts: one-millionth chance of death, useful for paying to reduce product risks, etc.

▪ QALYs: quality-adjusted life years, useful for medical decisions involving substantial risk

▪ Note: behavior is invariant under positive linear transformation

▪ With deterministic prizes only (no lottery choices), only ordinal utility can be determined, i.e., total order on prizes. To determine magnitudes, must ask questions about lottery preferences.
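The invariance note can be checked with a small sketch. The prizes, utilities, and the constants 3 and 7 are made-up values for illustration, and `best_action` is a hypothetical helper.

```python
def best_action(lotteries, U):
    """Action whose lottery has the highest expected utility under U."""
    def expected_utility(lottery):
        return sum(p * U[prize] for p, prize in lottery)
    return max(lotteries, key=lambda action: expected_utility(lotteries[action]))


lotteries = {
    "safe": [(1.0, "ok")],
    "risky": [(0.5, "great"), (0.5, "bad")],
}
U = {"bad": 0.0, "ok": 0.6, "great": 1.0}

# Positive linear transformation: U'(x) = 3 * U(x) + 7.
rescaled = {prize: 3.0 * u + 7.0 for prize, u in U.items()}
# Monotonic but nonlinear transformation: U'(x) = U(x) ** 2.
squared = {prize: u ** 2 for prize, u in U.items()}

print(best_action(lotteries, U))         # safe
print(best_action(lotteries, rescaled))  # safe
print(best_action(lotteries, squared))   # risky
```

The rescaled utilities lead to the same choice, while the squared utilities keep the ordering of prizes but change the decision between lotteries.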

Human Utilities

▪ Utilities map states to real numbers. Which numbers?

▪ Standard approach to assessment (elicitation) of human utilities:
  ▪ Compare a prize A to a standard lottery Lp between
    ▪ “best possible prize” u+ with probability p
    ▪ “worst possible catastrophe” u- with probability 1-p
  ▪ Adjust lottery probability p until indifference: A ~ Lp
  ▪ Resulting p is a utility in [0,1]

[Figure: Pay $30 ~ lottery between No pay (probability 0.999999) and Instant death (probability 0.000001)]
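The "adjust p until indifference" step could be carried out by bisection. This sketch assumes the respondent's answers are monotone in p (a higher p never makes the lottery less attractive). The respondent here is simulated with a hidden utility of 0.7, and all names are hypothetical.

```python
def elicit_utility(prefers_lottery, tol=1e-6):
    """Bisect on p until the respondent is (nearly) indifferent between A and Lp."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers_lottery(p):
            hi = p  # lottery too attractive: try a smaller p
        else:
            lo = p  # prize still preferred: try a larger p
    return (lo + hi) / 2


HIDDEN_UTILITY = 0.7  # stands in for the person's true utility of prize A


def simulated_respondent(p):
    # With u+ = 1 and u- = 0, the lottery Lp has expected utility p.
    return p > HIDDEN_UTILITY


estimate = elicit_utility(simulated_respondent)
print(round(estimate, 3))  # 0.7
```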

Money

▪ Money does not behave as a utility function, but we can talk about the utility of having money (or being in debt)

▪ Given a lottery L = [p, $X; (1-p), $Y]
  ▪ The expected monetary value EMV(L) is p*X + (1-p)*Y
  ▪ U(L) = p*U($X) + (1-p)*U($Y)
  ▪ Typically, U(L) < U(EMV(L))
  ▪ In this sense, people are risk-averse
  ▪ When deep in debt, people are risk-prone
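The inequality U(L) < U(EMV(L)) holds for any strictly concave utility of money. The sketch below uses a square-root curve as an assumed risk-averse utility and a squared curve as an assumed risk-prone one; neither curve comes from the slides.

```python
import math


def emv(lottery):
    """Expected monetary value of a lottery given as (probability, dollars) pairs."""
    return sum(p * x for p, x in lottery)


def expected_utility(lottery, U):
    """U(L) = sum of p * U(x) over the lottery's outcomes."""
    return sum(p * U(x) for p, x in lottery)


def risk_prone(x):
    return x ** 2  # assumed convex utility


lottery = [(0.5, 100.0), (0.5, 0.0)]
risk_averse = math.sqrt  # assumed concave utility

# Concave: the lottery is worth less than its expected monetary value for sure.
print(expected_utility(lottery, risk_averse) < risk_averse(emv(lottery)))  # True
# Convex: the lottery is worth more than its expected monetary value for sure.
print(expected_utility(lottery, risk_prone) > risk_prone(emv(lottery)))    # True
```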

Example: Insurance

▪ Consider the lottery [0.5, $1000; 0.5, $0]
  ▪ What is its expected monetary value? ($500)
  ▪ What is its certainty equivalent?
    ▪ Monetary value acceptable in lieu of lottery
    ▪ $400 for most people
  ▪ Difference of $100 is the insurance premium
    ▪ There’s an insurance industry because people will pay to reduce their risk
    ▪ If everyone were risk-neutral, no insurance needed!
  ▪ It’s win-win: you’d rather have the $400 and the insurance company would rather have the lottery (their utility curve is flat and they have many lotteries)
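Given a utility curve, the certainty equivalent is the sure amount whose utility equals the lottery's expected utility. The sketch below assumes a square-root utility purely for illustration, so its numbers differ from the $400 quoted for most people, which corresponds to a less curved utility.

```python
import math


def certainty_equivalent(lottery, U, U_inverse):
    """Sure amount of money with the same utility as the lottery."""
    expected_utility = sum(p * U(x) for p, x in lottery)
    return U_inverse(expected_utility)


lottery = [(0.5, 1000.0), (0.5, 0.0)]
emv = sum(p * x for p, x in lottery)  # 500.0

# Assumed utility U(x) = sqrt(x), whose inverse is u -> u ** 2.
ce = certainty_equivalent(lottery, math.sqrt, lambda u: u ** 2)
premium = emv - ce

print(emv, round(ce, 2), round(premium, 2))  # 500.0 250.0 250.0
```

A risk-neutral agent has U(x) = x, so its certainty equivalent equals the EMV and the premium is zero.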

Example: Human Rationality?

▪ Famous example of Allais (1953)
  ▪ A: [0.8, $4k; 0.2, $0]
  ▪ B: [1.0, $3k; 0.0, $0]

  ▪ C: [0.2, $4k; 0.8, $0]
  ▪ D: [0.25, $3k; 0.75, $0]

▪ Most people prefer B > A, C > D

▪ But if U($0) = 0, then
  ▪ B > A ⇒ U($3k) > 0.8 U($4k)
  ▪ C > D ⇒ 0.2 U($4k) > 0.25 U($3k) ⇒ 0.8 U($4k) > U($3k) (multiply both sides by 4; linear transforms are OK)
  ▪ The two inequalities contradict each other, so no utility function is consistent with both preferences
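The contradiction can also be checked by brute force. The sketch below searches a grid of candidate values for U($3k) and U($4k), with U($0) = 0, for a pair that makes both B > A and C > D hold under expected utility. The grid bounds are arbitrary, and exact fractions are used to avoid floating-point ties.

```python
from fractions import Fraction as F


def expected_utility(lottery):
    """Lottery given as (probability, utility) pairs."""
    return sum(p * u for p, u in lottery)


def matches_common_choices(u3k, u4k, u0=0):
    A = [(F(4, 5), u4k), (F(1, 5), u0)]
    B = [(F(1), u3k), (F(0), u0)]
    C = [(F(1, 5), u4k), (F(4, 5), u0)]
    D = [(F(1, 4), u3k), (F(3, 4), u0)]
    prefers_b = expected_utility(B) > expected_utility(A)
    prefers_c = expected_utility(C) > expected_utility(D)
    return prefers_b and prefers_c


found = any(
    matches_common_choices(u3k, u4k)
    for u3k in range(0, 101)
    for u4k in range(0, 101)
)
print(found)  # False
```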

Next Time: MDPs!