CSE 473: Ar+ﬁcial Intelligence - University of Washington · 2018. 1. 22. · § Chance nodes are like min nodes but the outcome is uncertain § Calculate their expected u+li+es

CSE473:Ar+ficialIntelligenceUncertaintyandExpec+maxTreeSearch

Instructors:LukeZeDlemoyer

UniveristyofWashington[TheseslideswereadaptedfromDanKleinandPieterAbbeelforCS188IntrotoAIatUCBerkeley.AllCS188materialsareavailableathDp://ai.berkeley.edu.]

UncertainOutcomes

Worst-Casevs.AverageCase

10 10 9 100

max

min

Idea:Uncertainoutcomescontrolledbychance,notanadversary!

Expec+maxSearch

§  Whywouldn’tweknowwhattheresultofanac+onwillbe?§  Explicitrandomness:rollingdice§  Unpredictableopponents:theghostsrespondrandomly§  Ac+onscanfail:whenmovingarobot,wheelsmightslip

§  Valuesshouldnowreflectaverage-case(expec+max)outcomes,notworst-case(minimax)outcomes

§  Expec+maxsearch:computetheaveragescoreunderop+malplay§  Maxnodesasinminimaxsearch§  Chancenodesarelikeminnodesbuttheoutcomeisuncertain§  Calculatetheirexpectedu+li+es§  I.e.takeweightedaverage(expecta+on)ofchildren

§  Later,we’lllearnhowtoformalizetheunderlyinguncertain-resultproblemsasMarkovDecisionProcesses

10 4 5 7

max

chance

10 10 9 100

[Demo:minvsexp(L7D1,2)]

VideoofDemoMinvs.Exp(Min)

VideoofDemoMinvs.Exp(Exp)

Expec+maxPseudocode

defvalue(state):ifthestateisaterminalstate:returnthestate’su+lityifthenextagentisMAX:returnmax-value(state)ifthenextagentisEXP:returnexp-value(state)

defexp-value(state):ini+alizev=0foreachsuccessorofstate: p=probability(successor)v+=p*value(successor)

returnv

defmax-value(state):ini+alizev=-∞ foreachsuccessorofstate:

v=max(v,value(successor))returnv

Expec+maxPseudocode

defexp-value(state):ini+alizev=0foreachsuccessorofstate: p=probability(successor)v+=p*value(successor)

returnv

5 78 24 -12

1/21/3

1/6

v=(1/2)(8)+(1/3)(24)+(1/6)(-12)=10

Expec+maxExample

12 9 6 0 3 2 15 4 6

Expec+maxPruning?

12 9 3 2

Depth-LimitedExpec+max

…

…

492 362 …

400 300Es+mateoftrueexpec+maxvalue(whichwouldrequirealotof

worktocompute)

Probabili+es

Reminder:Probabili+es§  Arandomvariablerepresentsaneventwhoseoutcomeisunknown§  Aprobabilitydistribu+onisanassignmentofweightstooutcomes

§  Example:Trafficonfreeway§  Randomvariable:T=whetherthere’straffic§  Outcomes:Tin{none,light,heavy}§  Distribu+on:P(T=none)=0.25,P(T=light)=0.50,P(T=heavy)=0.25

§  Somelawsofprobability(morelater):§  Probabili+esarealwaysnon-nega+ve§  Probabili+esoverallpossibleoutcomessumtoone

§  Aswegetmoreevidence,probabili+esmaychange:§  P(T=heavy)=0.25,P(T=heavy|Hour=8am)=0.60§  We’lltalkaboutmethodsforreasoningandupda+ngprobabili+eslater

0.25

0.50

0.25

§  Theexpectedvalueofafunc+onofarandomvariableistheaverage,weightedbytheprobabilitydistribu+onoveroutcomes

§  Example:Howlongtogettotheairport?

Reminder:Expecta+ons

0.25 0.50 0.25Probability:

20min 30min 60minTime:35minx x x+ +

§  Inexpec+maxsearch,wehaveaprobabilis+cmodelofhowtheopponent(orenvironment)willbehaveinanystate§  Modelcouldbeasimpleuniformdistribu+on(rolladie)§  Modelcouldbesophis+catedandrequireagreatdealof

computa+on§  Wehaveachancenodeforanyoutcomeoutofourcontrol:

opponentorenvironment§  Themodelmightsaythatadversarialac+onsarelikely!

§  Fornow,assumeeachchancenodemagicallycomesalongwithprobabili+esthatspecifythedistribu+onoveritsoutcomes

WhatProbabili+estoUse?

Havingaprobabilis.cbeliefaboutanotheragent’sac.ondoesnotmeanthattheagentisflippinganycoins!

Quiz:InformedProbabili+es

§  Let’ssayyouknowthatyouropponentisactuallyrunningadepth2minimax,usingtheresult80%ofthe+me,andmovingrandomlyotherwise

§  Ques+on:Whattreesearchshouldyouuse?

0.10.9

§  Answer:Expec+max!§  TofigureoutEACHchancenode’sprobabili+es,

youhavetorunasimula+onofyouropponent§  Thiskindofthinggetsveryslowveryquickly§  Evenworseifyouhavetosimulateyour

opponentsimula+ngyou…§  …exceptforminimax,whichhasthenice

propertythatitallcollapsesintoonegametree

ModelingAssump+ons

TheDangersofOp+mismandPessimism

DangerousOp+mismAssumingchancewhentheworldisadversarial

DangerousPessimismAssumingtheworstcasewhenit’snotlikely

Assump+onsvs.Reality

AdversarialGhost RandomGhost

MinimaxPacman

Expec+maxPacman

[Demos:worldassump+ons(L7D3,4,5,6)]

Resultsfromplaying5games

Pacmanuseddepth4searchwithanevalfunc+onthatavoidstroubleGhostuseddepth2searchwithanevalfunc+onthatseeksPacman

Won5/5

Avg.Score:503

Won5/5

Avg.Score:493

Won1/5

Avg.Score:-303

Won5/5

Avg.Score:483

VideoofDemoWorldAssump+onsRandomGhost–Expec+maxPacman

VideoofDemoWorldAssump+onsAdversarialGhost–MinimaxPacman

VideoofDemoWorldAssump+onsAdversarialGhost–Expec+maxPacman

VideoofDemoWorldAssump+onsRandomGhost–MinimaxPacman

OtherGameTypes

MixedLayerTypes

§  E.g.Backgammon§  Expec+minimax

§  Environmentisanextra“randomagent”playerthatmovesaxereachmin/maxagent

§  Eachnodecomputestheappropriatecombina+onofitschildren

Example:Backgammon

§  Dicerollsincreaseb:21possiblerollswith2dice§  Backgammon≈20legalmoves§  Depth2=20x(21x20)3=1.2x109

§  Asdepthincreases,probabilityofreachingagivensearchnodeshrinks§  Sousefulnessofsearchisdiminished§  Solimi+ngdepthislessdamaging§  Butpruningistrickier…

§  HistoricAI:TDGammonusesdepth-2search+verygoodevalua+onfunc+on+reinforcementlearning:world-championlevelplay

§  1stAIworldchampioninanygame!Image:Wikipedia

Mul+-AgentU+li+es

§  Whatifthegameisnotzero-sum,orhasmul+pleplayers?

§  Generaliza+onofminimax:§  Terminalshaveu+litytuples§  Nodevaluesarealsou+litytuples§  Eachplayermaximizesitsowncomponent§  Cangiverisetocoopera+onandcompe++ondynamically…

1,6,6 7,1,2 6,1,2 7,2,1 5,1,7 1,5,2 7,7,1 5,2,5

Documents

CSE 473: Ar+ﬁcial Intelligence - University of Washington · 2018. 1. 22. · § Chance nodes are like min nodes but the outcome is uncertain § Calculate their expected u+li+es