Introduction to AI, 8th Lecture: 1990’s – Agents. Wouter Beek, [email protected], 10 November 2010


Page 1: Introduction to AI - Eighth Lecture

Introduction to AI, 8th Lecture

1990’s – Agents

[email protected]

Page 2: Introduction to AI - Eighth Lecture

Origin
• The rational agent concept comes from economics.
• Utility theory: the theory of preferred outcomes.
• Decision theory: the dynamics of utility maximization in an unpredictable environment.
• Game theory: the dynamics of utility maximization when participants affect each other's utility in a predictable way.

Page 3: Introduction to AI - Eighth Lecture

Agent
• Agent:
  • Perceives the environment through sensors.
  • Acts on the environment through actuators.
  • The environment can be non-physical.
• Percept: the set of perceptions at some point in time.
• Percept sequence: the set of perception-time pairs.
• Agent function: percept sequence → action
• Agent program: an implementation of an agent function.
• Agent architecture

Page 4: Introduction to AI - Eighth Lecture

Rationality
• A rational being considers all the consequences of all possible actions, and makes these consequences part of the decision process for performing each of those actions.
• Given an environment and a percept sequence, what is the 'best' thing to do?
• Performance measure: an objective assessment of how successful an arbitrary sequence of environment states is.

Page 5: Introduction to AI - Eighth Lecture

Rational agent
Rationality depends on four variables:
1. Prior knowledge of the agent.
2. Performance measure of the environment state sequence.
3. Possible actions the agent can perform.
4. Percept sequence of the agent.

• Information gathering: performing (3) in order to enrich (4) and thereby increase (1).
• Learning: increase (1) through (4).
• Autonomy: all of (1) relates back to (4).
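One way to read the four variables together is as a single selection rule. This is a hedged formalization, not the lecture's own notation: every symbol is an assumption introduced for illustration. $K$ is the prior knowledge (1), $P$ the performance measure over an environment state sequence $s_{1:T}$ (2), $A$ the set of possible actions (3), and $p_{1:t}$ the percept sequence so far (4).

```latex
a^{*} \;=\; \operatorname*{arg\,max}_{a \in A} \;
\mathbb{E}\!\left[\, P(s_{1:T}) \;\middle|\; p_{1:t},\, K,\, a \,\right]
```

Read this way, information gathering and learning both change what the expectation is conditioned on, which is why they can change the chosen action without changing $P$ itself.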

Page 6: Introduction to AI - Eighth Lecture

Task environment
• Fully / partially observable
• Single / multiagent (competitive / cooperative)
• Deterministic / stochastic
• Episodic / sequential
• Static / dynamic / semi-dynamic
• Discrete / continuous
• Known / unknown

• Blocks world: fully observable, single agent, deterministic, episodic, static, known environment.
• 1990's: partially observable, multiagent, stochastic, sequential, dynamic, continuous, unknown environments.
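The seven dimensions can be treated as a small record, which makes the contrast between the two bullets explicit. The sketch below is only an illustration: the class name, field names, and the `differing_dimensions` helper are assumptions, and the blocks world's discrete/continuous value is left as `None` because the slide does not state it.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class TaskEnvironment:
    """One value per dimension; None means the slide does not say."""
    observable: str          # "fully" or "partially"
    agents: str              # "single" or "multi"
    determinism: str         # "deterministic" or "stochastic"
    episodes: str            # "episodic" or "sequential"
    change: str              # "static", "dynamic" or "semi-dynamic"
    values: Optional[str]    # "discrete" or "continuous"
    known: bool


# Values copied from the two bullets above.
blocks_world = TaskEnvironment(
    "fully", "single", "deterministic", "episodic", "static", None, True)
nineties = TaskEnvironment(
    "partially", "multi", "stochastic", "sequential", "dynamic",
    "continuous", False)


def differing_dimensions(a: TaskEnvironment, b: TaskEnvironment) -> List[str]:
    """Names of the dimensions on which two environments differ."""
    return [f.name for f in fields(a)
            if getattr(a, f.name) != getattr(b, f.name)]


print(differing_dimensions(blocks_world, nineties))
```

With these values the two environments differ on every listed dimension (the unstated one counts as a difference here only because `None` is not equal to `"continuous"`).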

Page 7: Introduction to AI - Eighth Lecture

Example
• Percepts: location (A, B), contents (dirty, clean).
• Actions: left, right, suck, idle.

Page 8: Introduction to AI - Eighth Lecture

Table-driven
• Low intelligence
• High complexity
• The task of AI is to improve on this complexity metric.
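A rough sketch of why a table-driven agent scores high on complexity, using the percepts and actions from the Example slide. The names `build_table`, `reflex_choice`, and `TableDrivenAgent` are hypothetical, and the policy used to fill the table is an assumption chosen only so the table has something in it. The point is the table size: one entry per percept sequence, not per percept.

```python
from itertools import product

LOCATIONS = ("A", "B")
CONTENTS = ("dirty", "clean")
PERCEPTS = list(product(LOCATIONS, CONTENTS))  # 4 possible percepts


def reflex_choice(percept):
    """Hypothetical policy, used only to fill the table."""
    location, contents = percept
    if contents == "dirty":
        return "suck"
    return "right" if location == "A" else "left"


def build_table(max_length):
    """Map every percept sequence of length 1..max_length to an action."""
    table = {}
    for length in range(1, max_length + 1):
        for sequence in product(PERCEPTS, repeat=length):
            table[sequence] = reflex_choice(sequence[-1])
    return table


class TableDrivenAgent:
    def __init__(self, table):
        self.table = table
        self.percepts = []  # the full percept sequence so far

    def act(self, percept):
        self.percepts.append(percept)
        # Fall back to "idle" once the sequence outgrows the table.
        return self.table.get(tuple(self.percepts), "idle")


for n in (1, 2, 3, 4):
    print(n, len(build_table(n)))  # 4, 20, 84, 340 entries
```

The table holds 4 + 4² + ... + 4ⁿ entries for sequences up to length n, so it grows exponentially even in this two-square world.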

Page 9: Introduction to AI - Eighth Lecture

Simple reflex agent
• No memory
• Low complexity: the number of percepts for which a reaction is defined.
• Condition-action rules
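A minimal sketch of condition-action rules for the two-square example, assuming the percept is a `(location, contents)` pair. The rule set itself is an illustrative assumption, not taken from the slides.

```python
# Condition-action rules: one entry per percept, nothing else is stored.
RULES = {
    ("A", "dirty"): "suck",
    ("B", "dirty"): "suck",
    ("A", "clean"): "right",
    ("B", "clean"): "left",
}


def simple_reflex_agent(percept):
    """Choose an action from the current percept only (no memory)."""
    return RULES.get(percept, "idle")


print(len(RULES))                           # 4 rules
print(simple_reflex_agent(("A", "dirty")))  # suck
```

The complexity in the slide's sense is `len(RULES)`: four rules, independent of how long the agent runs. The price is that the agent keeps moving back and forth forever, because it cannot remember that both squares are already clean.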

Page 10: Introduction to AI - Eighth Lecture

Model-based agent

Page 11: Introduction to AI - Eighth Lecture

Model-based agent
Inputs to deliberation:
• Current percepts
• State: model or internal representation.
• Condition-action rules.
• Recent actions.

• The state is updated based on previous state, most recent action, and percept.
• The action is chosen based on state and rules.
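The two steps above (update the state, then apply rules to the state) can be sketched for the two-square example as follows. The class name, the `"unknown"` marker, and the specific rules are assumptions made for illustration.

```python
class ModelBasedAgent:
    def __init__(self):
        # Internal state: believed position and believed contents per square.
        self.state = {"location": None, "A": "unknown", "B": "unknown"}
        self.last_action = None

    def update_state(self, percept):
        """New state from previous state, most recent action, and percept."""
        location, contents = percept
        previous = self.state["location"]
        # Predicted effect of the most recent action on the previous state.
        if self.last_action == "suck" and previous is not None:
            self.state[previous] = "clean"
        # The current percept overrides the prediction where they overlap.
        self.state["location"] = location
        self.state[location] = contents

    def act(self, percept):
        self.update_state(percept)
        location = self.state["location"]
        # Condition-action rules, now over the state instead of the percept.
        if self.state[location] == "dirty":
            action = "suck"
        elif self.state["A"] == self.state["B"] == "clean":
            action = "idle"
        else:
            action = "right" if location == "A" else "left"
        self.last_action = action
        return action


agent = ModelBasedAgent()
for percept in [("A", "dirty"), ("A", "clean"), ("B", "clean")]:
    print(agent.act(percept))  # suck, right, idle
```

Unlike the memoryless variant, this agent can stop once its model says both squares are clean.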

Page 12: Introduction to AI - Eighth Lecture

Goal-based agent

Page 13: Introduction to AI - Eighth Lecture

Utility-based agent

Page 14: Introduction to AI - Eighth Lecture

Utility-based agent
• Utility function: internalization of the performance measure.
• The action is chosen based on state, goal, and cost.
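A hedged sketch of utility-based action selection for the two-square example. Everything numeric here is an assumption for illustration: the utility function (one point per clean square, standing in for the goal), the action costs, and the 0.8 success probability of `suck`. The agent looks only one step ahead.

```python
COST = {"suck": 0.2, "left": 0.1, "right": 0.1, "idle": 0.0}  # assumed costs


def utility(state):
    """Hypothetical utility: one point per clean square."""
    return sum(1 for square in ("A", "B") if state[square] == "clean")


def outcomes(state, action):
    """Hypothetical stochastic model: list of (probability, next_state)."""
    here = state["location"]
    if action == "suck":
        cleaned = dict(state)
        cleaned[here] = "clean"
        return [(0.8, cleaned), (0.2, dict(state))]  # suck may fail
    if action == "left":
        return [(1.0, dict(state, location="A"))]
    if action == "right":
        return [(1.0, dict(state, location="B"))]
    return [(1.0, dict(state))]  # idle changes nothing


def expected_utility(state, action):
    gain = sum(p * utility(s) for p, s in outcomes(state, action))
    return gain - COST[action]


def choose_action(state):
    """Pick the action with the highest one-step expected utility."""
    return max(COST, key=lambda action: expected_utility(state, action))


print(choose_action({"location": "A", "A": "dirty", "B": "clean"}))  # suck
print(choose_action({"location": "A", "A": "clean", "B": "clean"}))  # idle
```

Because the lookahead is a single step, this sketch never moves toward a dirty square it is not standing on; a longer horizon would be needed for that.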

Page 15: Introduction to AI - Eighth Lecture

Learning agent

Page 16: Introduction to AI - Eighth Lecture

Multiagent
• Cooperation
• Competition
• Swarm intelligence: performance measure applied to collective behavior.
• Decentralized representation
• Emergent behavior
  • Weak emergence: the qualities of the system are reducible to the system's constituent parts.
  • Strong emergence: e.g. qualia.
• The concepts of utility and rationality change!

Page 17: Introduction to AI - Eighth Lecture

Prisoner's dilemma

Sentences in years:

                     Prisoner B silent    Prisoner B betray
Prisoner A silent    A: 0.5, B: 0.5       A: 10, B: 0
Prisoner A betray    A: 0, B: 10          A: 5, B: 5

Two suspects are arrested. If one testifies against the other (betray) and the other remains silent, the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months for a minor charge. If each betrays the other, each receives a 5-year sentence. How should the prisoners act?

• No matter what the other player does, a player always receives a shorter sentence by betraying (defecting).
• Since betraying is more beneficial than remaining silent in every situation, all rational players will betray.
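The dominance argument can be checked mechanically against the sentence table. In this sketch the function names are hypothetical, and a lower sentence is treated as the better outcome, so each prisoner minimizes their own years.

```python
# SENTENCE[(a_move, b_move)] = (years for A, years for B), from the table.
SENTENCE = {
    ("silent", "silent"): (0.5, 0.5),
    ("silent", "betray"): (10, 0),
    ("betray", "silent"): (0, 10),
    ("betray", "betray"): (5, 5),
}
MOVES = ("silent", "betray")


def best_response_for_a(b_move):
    """A's move with the shortest sentence, given B's move."""
    return min(MOVES, key=lambda a_move: SENTENCE[(a_move, b_move)][0])


def dominant_move_for_a():
    """The move that is A's best response to every B move, if one exists."""
    responses = {best_response_for_a(b_move) for b_move in MOVES}
    return responses.pop() if len(responses) == 1 else None


print(dominant_move_for_a())                # betray
print(sum(SENTENCE[("betray", "betray")]))  # 10 years in total
print(sum(SENTENCE[("silent", "silent")]))  # 1.0 year in total
```

The table is symmetric, so the same holds for prisoner B. The last two lines show the dilemma: the individually rational choice leads to 10 years served in total, while mutual silence would have cost only 1.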