Introduction to AI - Eighth Lecture


1990s: Agents


Origin
• The rational agent concept comes from economics.
• Utility theory: the theory of preferred outcomes.
• Decision theory: the dynamics of utility maximization in an unpredictable environment.
• Game theory: the dynamics of utility maximization when participants affect each other's utility in a predictable way.
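As a small illustration of the decision-theoretic idea above, the sketch below picks the action with the highest expected utility when the outcome is uncertain. The umbrella scenario, the probabilities, and the utility values are invented for illustration and do not come from the lecture.

```python
# Hypothetical example: every number below is an assumption for illustration.
# Each action maps to a list of (probability, utility) pairs over outcomes.
OUTCOMES = {
    "take_umbrella": [(0.3, 8.0), (0.7, 6.0)],    # (rain, no rain)
    "leave_umbrella": [(0.3, 0.0), (0.7, 10.0)],  # (rain, no rain)
}


def expected_utility(lottery):
    """Probability-weighted average utility of an uncertain outcome."""
    return sum(p * u for p, u in lottery)


def best_action(outcomes):
    """Return the action that maximizes expected utility."""
    return max(outcomes, key=lambda action: expected_utility(outcomes[action]))


if __name__ == "__main__":
    for action, lottery in OUTCOMES.items():
        print(action, round(expected_utility(lottery), 2))
    print("chosen:", best_action(OUTCOMES))
```

With these assumed numbers, leaving the umbrella has the higher expected utility (about 7.0 against about 6.6), even though it gives the worst outcome when it rains.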

Agent
• Agent:
  • Perceives the environment through sensors.
  • Acts on the environment through actuators.
  • The environment can be non-physical.
• Percept: the set of perceptions at some point in time.
• Percept sequence: the set of perception-time pairs.
• Agent function: percept sequence → action
• Agent program: an implementation of an agent function.
• Agent architecture
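The distinction between agent function and agent program can be sketched as follows. The agent function is a mapping from the whole percept sequence to an action; the agent program receives one percept at a time and keeps the sequence itself. The names `Agent` and `count_policy` are hypothetical and chosen only for this sketch.

```python
from typing import Callable, Hashable, List, Tuple

Percept = Hashable
Action = str
# Agent function: maps the whole percept sequence to an action.
AgentFunction = Callable[[Tuple[Percept, ...]], Action]


class Agent:
    """Agent program: receives one percept at a time and stores the
    percept sequence so the agent function can be applied to it."""

    def __init__(self, agent_function: AgentFunction) -> None:
        self.agent_function = agent_function
        self.percepts: List[Percept] = []

    def step(self, percept: Percept) -> Action:
        self.percepts.append(percept)
        return self.agent_function(tuple(self.percepts))


def count_policy(percept_sequence: Tuple[Percept, ...]) -> Action:
    """Hypothetical agent function: acts until it has seen three percepts."""
    return "act" if len(percept_sequence) < 3 else "idle"


if __name__ == "__main__":
    agent = Agent(count_policy)
    print([agent.step(p) for p in ["p1", "p2", "p3", "p4"]])
```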

Rationality
• A rational being considers all the consequences of all possible actions, and makes these consequences part of the decision process for performing each of those actions.
• Given an environment and a percept sequence, what is the 'best' thing to do?
• Performance measure: an objective assessment of the success of an arbitrary environment state sequence.

Rational agent
What is rational depends on:
1. Prior knowledge of the agent.
2. Performance measure of the environment state sequence.
3. Possible actions the agent can perform.
4. Percept sequence of the agent.

• Information gathering: performing (3) in order to enrich (4) and thereby increase (1).
• Learning: increase (1) through (4).
• Autonomy: all of (1) relates back to (4).

Task environment
• Fully / partially observable
• Single / multiagent (competitive / cooperative)
• Deterministic / stochastic
• Episodic / sequential
• Static / dynamic / semidynamic
• Discrete / continuous
• Known / unknown

• Blocks world: fully observable, single agent, deterministic, episodic, static, known environment.
• 1990s: partially observable, multiagent, stochastic, sequential, dynamic, continuous, unknown environments.

Example
• Percepts: location (A, B), contents (dirty, clean).
• Actions: left, right, suck, idle.
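A minimal sketch of this two-square world as an environment, using exactly the percepts and actions listed above. The scoring rule (one point per clean square per time step) is an assumption made for the sketch; the lecture does not fix a performance measure for the example. The class name `VacuumWorld` is hypothetical.

```python
class VacuumWorld:
    """Two squares, A and B; each is 'dirty' or 'clean'."""

    def __init__(self, contents, location="A"):
        self.contents = dict(contents)  # e.g. {"A": "dirty", "B": "clean"}
        self.location = location
        self.score = 0                  # accumulated performance measure

    def percept(self):
        """What the agent senses: (location, contents of that square)."""
        return (self.location, self.contents[self.location])

    def execute(self, action):
        """Apply one of the four actions, then score the resulting state."""
        if action == "suck":
            self.contents[self.location] = "clean"
        elif action == "left":
            self.location = "A"
        elif action == "right":
            self.location = "B"
        elif action != "idle":
            raise ValueError(f"unknown action: {action}")
        # Assumed performance measure: +1 per clean square per time step.
        self.score += sum(c == "clean" for c in self.contents.values())


if __name__ == "__main__":
    world = VacuumWorld({"A": "dirty", "B": "dirty"})
    for action in ["suck", "right", "suck", "idle"]:
        world.execute(action)
    print(world.percept(), world.score)
```

The score is computed from the sequence of environment states, not from what the agent believes it has achieved.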

Table-driven
• Low intelligence
• High complexity
• The task of AI is to improve on this complexity metric.
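A sketch of a table-driven agent for the two-square example, assuming the table is keyed by the entire percept sequence. The partial table and the fallback to `idle` are illustrative assumptions. The helper `table_size` shows why complexity is high: a complete table needs one entry per possible percept sequence.

```python
# 2 locations x 2 contents = 4 possible percepts per time step.
PERCEPTS = [(loc, c) for loc in ("A", "B") for c in ("dirty", "clean")]

# Partial lookup table keyed by the full percept sequence (illustrative only).
TABLE = {
    (("A", "dirty"),): "suck",
    (("A", "clean"),): "right",
    (("B", "dirty"),): "suck",
    (("B", "clean"),): "left",
    (("A", "dirty"), ("A", "clean")): "right",
    (("A", "clean"), ("B", "dirty")): "suck",
}


def make_table_driven_agent(table):
    """Return an agent program that looks up its whole percept history."""
    history = []

    def program(percept):
        history.append(percept)
        # Sequences missing from the partial table fall back to 'idle'.
        return table.get(tuple(history), "idle")

    return program


def table_size(num_percepts, horizon):
    """Entries needed to cover every percept sequence up to `horizon` steps."""
    return sum(num_percepts ** t for t in range(1, horizon + 1))


if __name__ == "__main__":
    agent = make_table_driven_agent(TABLE)
    print(agent(("A", "dirty")), agent(("A", "clean")))
    print(table_size(len(PERCEPTS), 10))
```

Even in this tiny world, covering all percept sequences of up to 10 steps takes 1,398,100 entries, and the count quadruples with each additional step.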

Simple reflex agent
• No memory
• Low complexity: the number of percepts for which a reaction is defined.
• Condition-action rules
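A simple reflex agent for the same example can be sketched as a rule set keyed by the current percept only. The specific rules are an assumption; the lecture names the rule type but does not list the rules.

```python
# Condition-action rules keyed by the current percept only (no memory).
# The concrete rules are assumed for illustration.
RULES = {
    ("A", "dirty"): "suck",
    ("B", "dirty"): "suck",
    ("A", "clean"): "right",
    ("B", "clean"): "left",
}


def simple_reflex_agent(percept):
    """Choose an action from the current percept alone."""
    return RULES.get(percept, "idle")


if __name__ == "__main__":
    for percept in [("A", "dirty"), ("A", "clean"), ("B", "clean")]:
        print(percept, "->", simple_reflex_agent(percept))
    print("rules defined:", len(RULES))
```

The complexity is the four rules, independent of how long the agent runs. The price of having no memory is that this agent keeps moving back and forth once both squares are clean, because it cannot know that the other square was already cleaned.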

Model-based agent
Inputs to deliberation:
• Current percepts
• State: model or internal representation.
• Condition-action rules.
• Recent actions.

• The state is updated based on the previous state, the most recent action, and the percept.
• The action is chosen based on the state and the rules.
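The two steps above (update the state, then choose by rule) can be sketched for the two-square example. The internal state here is the agent's belief about both squares; the dictionary layout, the `unknown` marker, and the concrete rules are assumptions made for this sketch.

```python
def update_state(state, last_action, percept):
    """New state from the previous state, most recent action, and percept."""
    location, contents = percept
    new_state = dict(state)
    if last_action == "suck" and state["location"] is not None:
        # Model of the action's effect: the square we were on is now clean.
        new_state[state["location"]] = "clean"
    new_state["location"] = location
    new_state[location] = contents  # the percept overrides the prediction
    return new_state


def choose_action(state):
    """Condition-action rules applied to the internal state."""
    here = state["location"]
    other = "B" if here == "A" else "A"
    if state[here] == "dirty":
        return "suck"
    if state[other] != "clean":     # unknown or dirty: go and look
        return "right" if here == "A" else "left"
    return "idle"                   # the model says everything is clean


def make_model_based_agent():
    state = {"location": None, "A": "unknown", "B": "unknown"}
    last_action = None

    def program(percept):
        nonlocal state, last_action
        state = update_state(state, last_action, percept)
        last_action = choose_action(state)
        return last_action

    return program


if __name__ == "__main__":
    agent = make_model_based_agent()
    percepts = [("A", "dirty"), ("A", "clean"), ("B", "clean"), ("B", "clean")]
    print([agent(p) for p in percepts])
```

Unlike the simple reflex agent, this agent can idle once its model says both squares are clean.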

Goal-based agent

Utility-based agent
• Utility function: internalization of the performance measure.
• The action is chosen based on state, goal, and cost.
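A sketch of utility-based action selection in the two-square example. The utility function internalizes an assumed performance measure (a reward per clean square per time step, expressing the goal) and subtracts an assumed action cost. The reward of 10, the costs, and the two-step lookahead are all illustrative choices, not values from the lecture.

```python
ACTION_COST = {"left": 1.0, "right": 1.0, "suck": 1.0, "idle": 0.0}  # assumed
REWARD_PER_CLEAN_SQUARE = 10.0                                        # assumed


def result(state, action):
    """Predicted next state; a state is (location, (contents_A, contents_B))."""
    location, (a, b) = state
    if action == "suck":
        if location == "A":
            a = "clean"
        else:
            b = "clean"
    elif action == "left":
        location = "A"
    elif action == "right":
        location = "B"
    return (location, (a, b))


def step_utility(state, action):
    """Utility of one step: reward for clean squares minus the action cost."""
    next_state = result(state, action)
    clean = sum(c == "clean" for c in next_state[1])
    return REWARD_PER_CLEAN_SQUARE * clean - ACTION_COST[action], next_state


def plan_value(state, depth):
    """Best total utility achievable within `depth` further actions."""
    if depth == 0:
        return 0.0
    values = []
    for action in ACTION_COST:
        gain, next_state = step_utility(state, action)
        values.append(gain + plan_value(next_state, depth - 1))
    return max(values)


def utility_based_agent(state, depth=2):
    """Pick the action whose best follow-up plan has the highest utility."""
    def value(action):
        gain, next_state = step_utility(state, action)
        return gain + plan_value(next_state, depth - 1)
    return max(ACTION_COST, key=value)


if __name__ == "__main__":
    state = ("A", ("dirty", "dirty"))
    for _ in range(4):
        action = utility_based_agent(state)
        print(state, "->", action)
        state = result(state, action)
```

A one-step lookahead would not be enough here: moving to the other square costs something and cleans nothing by itself, so its benefit only shows up when the following action is considered too.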

Learning agent

Multiagent
• Cooperation
• Competition
• Swarm intelligence: performance measure applied to collective behavior.
• Decentralized representation
• Emergent behavior
• Weak emergence: the qualities of the system are reducible to the system's constituent parts.
• Strong emergence: e.g. qualia.
• The concepts of utility and rationality change!

Prisoner's dilemma (sentences in years)

                     Prisoner B silent    Prisoner B betray
Prisoner A silent    A: 0.5, B: 0.5       A: 10, B: 0
Prisoner A betray    A: 0, B: 10          A: 5, B: 5

Two suspects are arrested. If one testifies against the other (betray) and the other remains silent, the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months for a minor charge. If each betrays the other, each receives a 5-year sentence. How should the prisoners act?

• No matter what the other player does, a player always gains a greater payoff (a shorter sentence) by betraying.

• Since in any situation betraying is more beneficial than remaining silent, all rational players will betray.
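The dominance argument can be checked mechanically against the sentences given in the scenario. In the sketch below a shorter sentence counts as a higher payoff; the function names are hypothetical.

```python
# Sentences in years, as stated in the scenario:
# YEARS[(a_move, b_move)] = (years for A, years for B).
YEARS = {
    ("silent", "silent"): (0.5, 0.5),
    ("silent", "betray"): (10, 0),
    ("betray", "silent"): (0, 10),
    ("betray", "betray"): (5, 5),
}
MOVES = ("silent", "betray")


def best_response_for_a(b_move):
    """A's move that minimizes A's own sentence, given B's move."""
    return min(MOVES, key=lambda a_move: YEARS[(a_move, b_move)][0])


def dominant_move_for_a():
    """A's move if it is the best response to every move of B, else None."""
    responses = {best_response_for_a(b_move) for b_move in MOVES}
    return responses.pop() if len(responses) == 1 else None


if __name__ == "__main__":
    for b_move in MOVES:
        print("B plays", b_move, "-> A's best response:", best_response_for_a(b_move))
    print("dominant move for A:", dominant_move_for_a())
    print("total years, both betray:", sum(YEARS[("betray", "betray")]))
    print("total years, both silent:", sum(YEARS[("silent", "silent")]))
```

Betraying is the best response to both of B's moves, and by symmetry the same holds for B. The dilemma is that mutual betrayal costs 10 years in total, while mutual silence would cost only 1.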

Technology
