Autonomous AI

Steve Hookway, 9/13/04

Outline
  Constraining Autonomous Behavior (3.1)
  Simple Techniques for Coordinated Behavior (3.2)
  Synthetic Adversaries for Urban Combat Training (Laird et al.)

No One Lives Forever 2
  UNITY agent Cate Archer fights the H.A.R.M. worldwide crime syndicate
  Use a combination of stealth and brute force to work through each level

No One Lives Forever 2
  Excellent reviews for both gameplay and AI
  Games Domain (5/5): "NOLF 2 is extremely hard to fault. Its presentation is top-notch, its levels are varied and interesting, its AI is convincing, and the whole game drips with love and attention to detail. In fact, it's hard to remember a single-player FPS we enjoyed this much since Deus Ex ... Any first-person shooter fan should consider this an essential purchase."
  HomeLan Fed (94%): "one of the best single player first person shooter games ever made. With solid AI, great looking levels and graphics, fun weapons and gadgets, excellent voice acting, and a healthy dose of humor, A Spy In H.A.R.M’s way is much better than the original and that’s saying something"

Ideal vs. Actual Behavior
  Ideally: completely autonomous AI
    Goal-based system
    Tagged objects
  Realistically: human behaviors need to be added
    Total autonomy does not necessarily lead to believable behavior

Human Characteristics
  Ownership
  Dependency
  Responsibility
  Relevance
  Priority
  State of Consciousness
  Expected State
  Presence of Others

Ownership
  Without constraints, all objects of interest are up for grabs
  Implement ownership by grouping objects into sets
  Each character has his own set of objects
  “Is this object mine to use?”
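One way to picture the ownership check is as a lookup in per-character object sets. The sketch below is only an illustration under that assumption: OwnershipTable, Assign, IsMine, and the string identifiers are hypothetical names, not taken from the game's code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// All names here are hypothetical; the slides only describe the idea.
typedef std::string CharacterID;
typedef std::string ObjectID;

class OwnershipTable {
public:
    // Put an object into the set owned by one character.
    void Assign(const CharacterID& owner, const ObjectID& object) {
        assert(!owner.empty() && !object.empty());
        m_owned[owner].insert(object);
    }

    // "Is this object mine to use?"
    bool IsMine(const CharacterID& who, const ObjectID& object) const {
        auto it = m_owned.find(who);
        return it != m_owned.end() && it->second.count(object) > 0;
    }

private:
    // One set of objects per character.
    std::unordered_map<CharacterID, std::unordered_set<ObjectID>> m_owned;
};
```

A character would call IsMine before considering an object, so a guard ignores a desk that belongs to someone else.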

Dependency
  The use of one object depends on the use of another object
    Washing hands after using the bathroom
  Can’t just make one inseparable behavior
  Enable and disable objects to express the dependency
  “Is there anything I need to do before using this object?”
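A minimal sketch of the enable/disable idea, assuming that a dependent object starts disabled and is switched on when the object it depends on is used. ObjectRegistry, SmartObject, and the field enablesAfterUse are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical record for a usable world object.
struct SmartObject {
    bool enabled;                 // can a character use it right now?
    std::string enablesAfterUse;  // object unlocked by using this one ("" if none)
};

class ObjectRegistry {
public:
    void Add(const std::string& name, bool enabled,
             const std::string& enablesAfterUse = "") {
        assert(!name.empty());
        SmartObject obj;
        obj.enabled = enabled;
        obj.enablesAfterUse = enablesAfterUse;
        m_objects[name] = obj;
    }

    // "Is there anything I need to do before using this object?"
    bool CanUse(const std::string& name) const {
        auto it = m_objects.find(name);
        return it != m_objects.end() && it->second.enabled;
    }

    // Using an object enables the object that depends on it, if any.
    bool Use(const std::string& name) {
        auto it = m_objects.find(name);
        if (it == m_objects.end() || !it->second.enabled) return false;
        auto dependent = m_objects.find(it->second.enablesAfterUse);
        if (dependent != m_objects.end()) dependent->second.enabled = true;
        return true;
    }

private:
    std::unordered_map<std::string, SmartObject> m_objects;
};
```

The two behaviors stay separate: the sink simply becomes available once the toilet has been used. A fuller version would presumably disable the sink again after the hands are washed.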

Responsibility
  Responsibility as a function of location
    Keeps the player from being overwhelmed by enemies
  Responsibility as a function of the class of character
    Set which objects are available to be used by that class of character
  “What is my purpose in life?”

Relevance
  A minor character’s goals should not have a major influence on the world!
  Provide a ceiling on the amount of influence an AI character can have over its surroundings
  “How will my actions affect the player’s experience?”

Priority
  A character may be presented with multiple opportunities simultaneously
  3 categories of behavior
    Aggressive > investigative > relaxed
    Relative priorities within categories
  “What is most important right now?”
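The two-level ordering above (category first, then relative priority inside the category) can be sketched as a simple comparison. The Opportunity struct, its integer priorityWithinCategory field, and PickMostImportant are assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Larger value = more urgent: aggressive > investigative > relaxed.
enum class Category { Relaxed = 0, Investigative = 1, Aggressive = 2 };

// Hypothetical description of something a character could do right now.
struct Opportunity {
    std::string name;
    Category category;
    int priorityWithinCategory;  // assumed tie-breaker; larger wins
};

// "What is most important right now?"
// Returns the index of the winning opportunity, or -1 if there are none.
int PickMostImportant(const std::vector<Opportunity>& options) {
    int best = -1;
    for (int i = 0; i < static_cast<int>(options.size()); ++i) {
        if (best < 0) { best = i; continue; }
        const Opportunity& candidate = options[i];
        const Opportunity& current = options[best];
        const bool higherCategory = candidate.category > current.category;
        const bool sameCategoryHigherPriority =
            candidate.category == current.category &&
            candidate.priorityWithinCategory > current.priorityWithinCategory;
        if (higherCategory || sameCategoryHigherPriority) best = i;
    }
    return best;
}
```

Note that a relaxed behavior never beats an aggressive one, no matter how high its priority is within its own category.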

State of Consciousness
  Alive, Sleeping, Unconscious, Dead
  Players emit stimuli depending on state
  Can act according to previous or current state
  “Am I awake?”

Expected State
  Consciousness is an example of the general concept of state
  Objects have associated state
    Notice when an object is out of place
    Notice when a player’s or object’s state has changed
  “Has anything changed?”

Presence of Others
  Behavior depends on who is present
  Characters need to take each other into account
  “How will this action affect other characters?”

Outline
  Constraining Autonomous Behavior (3.1)
  Simple Techniques for Coordinated Behavior (3.2)
  Synthetic Adversaries for Urban Combat Training (Laird et al.)

Coordinated Behavior
  “Split Up”
  “Get Out of My Way”
  “Get Down!”
  “What’s Going On?”

“Split Up”
  Agents are driven by the same logic and therefore behave in the same way upon receiving a stimulus
    Same shortest-path algorithm
    Nearby agents follow the same path
    Crowding!

“Split Up” cont.
  Path of waypoints to target
  Reservations
    Shortest path as least costly path
    A node is not released until the agent completes its path
    Reserve the node BEFORE the destination
  Different paths give the illusion of coordination

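Pathfinding
  [Figure: the same waypoint graph shown twice, with nodes labeled E, A, and B. In the first copy every cost is 1; in the second copy one cost is raised to 50 and marked “Reserved”.]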

Maximum Occupancy
  The pathfinding algorithm works well in combat
    Firing range prevents crowding
  When investigating, crowding becomes a problem
    First agent should reserve the object under investigation
    Other agents maintain distance

Blackboard Communication
  Post and query public information
  Avoids the overhead of adding data to the pathfinding data structures themselves
  Makes code more maintainable

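A Record on the Blackboard

    struct BlackboardRecord {
        BBRecordType eType;     // kind of record (e.g. a reserved volume)
        ObjectID     idPoster;  // object that posted the record
        ObjectID     idTarget;  // object the record refers to
        int          nData;     // extra record-specific data
    };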

The Blackboard …

    class CBlackboard {
    public:
        void PostBBRecord(BBRecordType eType, ObjectID idPoster, ObjectID idTarget, int nData);
        void RemoveBBRecord(BBRecordType eType, ObjectID idPoster);
        void RemoveAllBBRecords(BBRecordType eType);
        int CountBBRecords(BBRecordType eType, ObjectID idTarget);
        BlackboardRecord* GetBBRecord(BBRecordType eType, ObjectID idTarget);

    protected:
        BlackboardRecord* m_aBBRecords[MAX_BB];
    };
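The slides show only the interface, so the following is a hedged sketch of how such a blackboard might be filled in. It swaps the fixed-size pointer array for a std::vector to stay short, and it assumes that ObjectID is an int and that BBRecordType is a plain enum; only kBB_ReservedVolume appears in the slides, and RemoveAllBBRecords is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

typedef int ObjectID;                                   // assumption
enum BBRecordType { kBB_Invalid, kBB_ReservedVolume };  // kBB_Invalid is hypothetical

struct BlackboardRecord {
    BBRecordType eType;
    ObjectID idPoster;
    ObjectID idTarget;
    int nData;
};

class CBlackboard {
public:
    void PostBBRecord(BBRecordType eType, ObjectID idPoster, ObjectID idTarget, int nData) {
        BlackboardRecord rec = { eType, idPoster, idTarget, nData };
        m_records.push_back(rec);
    }

    // Remove every record of this type posted by idPoster.
    void RemoveBBRecord(BBRecordType eType, ObjectID idPoster) {
        m_records.erase(
            std::remove_if(m_records.begin(), m_records.end(),
                           [eType, idPoster](const BlackboardRecord& r) {
                               return r.eType == eType && r.idPoster == idPoster;
                           }),
            m_records.end());
    }

    int CountBBRecords(BBRecordType eType, ObjectID idTarget) const {
        return static_cast<int>(
            std::count_if(m_records.begin(), m_records.end(),
                          [eType, idTarget](const BlackboardRecord& r) {
                              return r.eType == eType && r.idTarget == idTarget;
                          }));
    }

    // First matching record, or nullptr. The pointer is only valid until
    // the next post or remove.
    const BlackboardRecord* GetBBRecord(BBRecordType eType, ObjectID idTarget) const {
        for (const BlackboardRecord& r : m_records) {
            if (r.eType == eType && r.idTarget == idTarget) return &r;
        }
        return nullptr;
    }

private:
    std::vector<BlackboardRecord> m_records;
};
```

A linear scan is adequate here on the assumption that only a handful of records are live at any time.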

Agent path planning
  Clear any previously reserved volumes and reserve the volume before the destination:

    g_pBlackboard->RemoveBBRecord(kBB_ReservedVolume, myID);
    g_pBlackboard->PostBBRecord(kBB_ReservedVolume, myID, volumeID, 0);  // nData is an int and is unused here

Agent path planning cont.
  Check if the volume is reserved and apply a cost penalty if it is:

    float fCost = pVolume->fCost;
    if (g_pBlackboard->GetBBRecord(kBB_ReservedVolume, volumeID)) {
        fCost *= 50.f;  // reserved volumes become 50 times more expensive
    }
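To see how the penalty steers a second agent onto a different route, here is a self-contained sketch that plugs the same 50x multiplier into a textbook Dijkstra search. The Volume struct, the integer volume indices, the std::set of reserved volumes, and FindPath are assumptions for the example; the slides do not say which search algorithm the game uses.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <set>
#include <utility>
#include <vector>

// Hypothetical navigation volume: a cost to enter it and its neighbors.
struct Volume {
    float fCost;
    std::vector<int> neighbors;
};

// Least costly path from start to goal (volume indices), empty if unreachable.
// Entering a reserved volume costs 50 times its normal cost.
std::vector<int> FindPath(const std::vector<Volume>& volumes,
                          const std::set<int>& reserved, int start, int goal) {
    const int count = static_cast<int>(volumes.size());
    assert(start >= 0 && start < count && goal >= 0 && goal < count);

    const float kInf = std::numeric_limits<float>::infinity();
    std::vector<float> dist(volumes.size(), kInf);
    std::vector<int> prev(volumes.size(), -1);
    typedef std::pair<float, int> Entry;  // (cost so far, volume index)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;

    dist[start] = 0.f;
    open.push(Entry(0.f, start));
    while (!open.empty()) {
        const Entry top = open.top();
        open.pop();
        const int u = top.second;
        if (top.first > dist[u]) continue;  // stale queue entry
        if (u == goal) break;
        for (int v : volumes[u].neighbors) {
            float fCost = volumes[v].fCost;
            if (reserved.count(v) > 0) fCost *= 50.f;  // penalty from the slide
            if (dist[u] + fCost < dist[v]) {
                dist[v] = dist[u] + fCost;
                prev[v] = u;
                open.push(Entry(dist[v], v));
            }
        }
    }

    std::vector<int> path;
    if (dist[goal] == kInf) return path;
    for (int n = goal; n != -1; n = prev[n]) path.insert(path.begin(), n);
    return path;
}
```

Because a reserved volume is made expensive rather than forbidden, an agent can still pass through it when no alternative route exists.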

“Get Out of My Way”
  It is inevitable that agents will get in each other’s way
  Too much overhead to recalculate the optimal position
  Better to deal gracefully with the obstruction

“Get Out of My Way” cont.
  1. The obstructed agent can do nothing
  2. The obstructed agent can fire anyway, either killing his ally or firing through him
  3. The obstructed agent can move
  4. The obstructed agent’s ally can move

Implementation
  Options 3 and 4 are the way to go
  The agent requests that the ally move
  If the ally is busy (e.g., reloading), the agent moves instead
  Leads to a dynamic environment with everyone in motion looking for a clear shot

Implementation cont.
  The previous method works well with ranged weapons but not with close-range melee weapons
  Resort to kung-fu style (attackers take turns rather than all closing in at once)

“Get Down!”
  Agents have a limited number of actions
  We don’t want all agents to take the same action simultaneously
  Use the blackboard to post actions
    Agent posts action and time
    Before taking an action, the agent checks whether an ally has done the same action within a threshold time
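The check described above can be sketched as a table of last-performed times per action. This is an illustration only: ActionBoard and TryAction are hypothetical names, time is passed in as seconds, and for brevity the sketch does not distinguish the asking agent from its allies.

```cpp
#include <cassert>
#include <map>
#include <string>

class ActionBoard {
public:
    explicit ActionBoard(double thresholdSeconds) : m_threshold(thresholdSeconds) {
        assert(thresholdSeconds >= 0.0);
    }

    // Returns true and posts (action, now) if nobody has performed the same
    // action within the threshold; otherwise returns false and posts nothing.
    bool TryAction(const std::string& action, double now) {
        auto it = m_lastTime.find(action);
        if (it != m_lastTime.end() && now - it->second < m_threshold) {
            return false;  // too soon after someone did the same thing
        }
        m_lastTime[action] = now;
        return true;
    }

private:
    double m_threshold;
    std::map<std::string, double> m_lastTime;  // action -> time it was last posted
};
```

An agent whose first choice is rejected would fall back to another action from its limited repertoire, which is what keeps a squad from crouching in unison.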

“What’s Going On?”
  We don’t want agents to stand idle while a nearby ally is taking damage
  The same sensory system that detects enemies can be used to detect each other
  When an agent detects danger or takes damage, it emits a signal to any other agent within earshot
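A minimal sketch of the earshot signal, assuming agents live on a 2D plane and each has a circular hearing radius. The Agent fields and EmitDangerSignal are hypothetical; the real game presumably routes this through its existing sensory system.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical agent with a position and a hearing radius.
struct Agent {
    float x;
    float y;
    float hearingRadius;
    bool alerted;
};

// The agent at index `source` detects danger or takes damage. Every other
// agent close enough to hear it becomes alerted. Returns how many were alerted.
int EmitDangerSignal(std::vector<Agent>& agents, std::size_t source) {
    assert(source < agents.size());
    int alertedCount = 0;
    for (std::size_t i = 0; i < agents.size(); ++i) {
        if (i == source) continue;
        const float dx = agents[i].x - agents[source].x;
        const float dy = agents[i].y - agents[source].y;
        const float r = agents[i].hearingRadius;
        if (dx * dx + dy * dy <= r * r) {  // compare squared distances
            agents[i].alerted = true;
            ++alertedCount;
        }
    }
    return alertedCount;
}
```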

Coordination

We have achieved simple coordination without the addition of a group behavior layer!

Outline
  Constraining Autonomous Behavior (3.1)
  Simple Techniques for Coordinated Behavior (3.2)
  Synthetic Adversaries for Urban Combat Training (Laird et al.)

Synthetic Adversaries
  Virtual training of soldiers for Military Operations on Urbanized Terrain (MOUT)
  System based on a bot developed to play Quake
  Uses the environment of Infiltration, a modification of Unreal Tournament (UT)

Requirements for Synthetic Adversaries
  Competence
  Taskability
  Observational fidelity
  Behavior variability
  Transparency
  Minimal computational footprint

Competence
  Must perform the actions and behaviors humans perform in this domain
  Many different roles
    Defending a room
    Sentry
    Acting as a leader
  Focus on a small team interacting

Competence
  Threat data structure
    Used to maintain awareness about the enemy
  Building map data structure
  Mission object
  Operators are used to take action in the world

Taskability
  Must be able to be assigned new missions and change objectives during an exercise
  All agent knowledge is independent of the specific mission
  MOUTBots can issue a limited number of commands to change the mission of other bots

Observational Fidelity
  Model human behavior only in those aspects which can be observed
  This is one place where we can limit computation time
  Subject to making the same “human errors”

Behavior Variability
  Must not be predictable
  Allows soldiers to train in the same situation but experience a new encounter
  Reduces the number of surprises in real-life combat
  Assign a probability to each potential action
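Assigning a probability to each potential action amounts to sampling from a discrete distribution. The sketch below shows one common way to do that, walking the cumulative weights; WeightedAction and ChooseAction are hypothetical names, and the random roll is passed in so the function stays deterministic and easy to test.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pairing of an action with its selection probability.
struct WeightedAction {
    std::string name;
    double probability;  // relative weight; weights need not sum to exactly 1
};

// `roll` is a uniform random number in [0, 1). The action whose slice of
// the cumulative distribution contains the roll is chosen.
const std::string& ChooseAction(const std::vector<WeightedAction>& actions, double roll) {
    assert(!actions.empty());
    double total = 0.0;
    for (const WeightedAction& a : actions) total += a.probability;

    const double threshold = roll * total;
    double cumulative = 0.0;
    for (const WeightedAction& a : actions) {
        cumulative += a.probability;
        if (threshold < cumulative) return a.name;
    }
    return actions.back().name;  // guards against floating-point round-off
}
```

In a running system the roll could come from std::uniform_real_distribution<double>(0.0, 1.0) driven by a std::mt19937 engine, so that the same situation plays out differently from one exercise to the next.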

Transparency
  Allow after-action review by keeping a record explaining actions
  Uses an application-specific visualization tool to show what the agent believes the state of the map to be
  Can inspect an agent’s goals, e.g., find a “defend room” goal among the agent’s active goals

Minimal Computational Footprint
  Targeted for personal computers
  Majority of processing reserved for graphics and physics modeling
  Have been able to run as many as 20 bots in UT while keeping the game’s frame rate

Conclusion
  Fully autonomous AI is a long way off, but it is still possible to model human behavior

BONUS MATERIAL

Creature Smarts
  Is it possible to build an artificial intelligence that’s as clever, as adaptive and as captivating as the average dog?
  C4 and Sheep|Dog
  Duncan the Terrier

Autonomous AI
  Build creatures that can interact with each other and with human participants
  Self-sufficient reasoning
    Learning
    Action selection
    Goals

System Architecture
  [Figure: block diagram with the components World Model, Creature 1 Mental Rep, Creature 2 Mental Rep, Local Devices, Networked Devices, Input Server, Renderers, and World State]

Cognitive Architecture
  [Figure: block diagram with the components Sensory System, Perception System, Working Memory, Action System (Attention Selection, Action Selection), Navigation System, Motor System, Proprioceptive System, Internal Blackboard (Object of Attention, Motor Desired, Motor Actual), and The World]

Sensory System
  Sensory honesty
  Input from the world (from the creature’s point of view)

Perception System
  Percept: perceives a stimulus
  Classification hierarchy
  [Figure: percept tree rooted at “Everything”, with the nodes Head Location, Sound, World Location, Shape, Humanoid, Sheep, Utterance, Down, Sit, and Away]

Percept and Memory
  After classifying, hold onto percepts and place like percepts together
  Allows related events to be matched

Why bother with percepts?
  Causes honest mistakes
    Believable behavior
  Learning generalizable concepts
    Find rules that apply to many objects
  Subjective experience
    How a stimulus is perceived depends on the creature

Action System
  Any representation of action should answer the big 4
    When to do it?
    What to do and how to do it?
    What to do it to?
    For how long?

Action Tuples
  Trigger Context, Action, Object Context, Do Until Context

  Trigger Context         Action                    Object Context         Do Until Context
  “Away” Sound Percept    Circle Counterclockwise   Sheep Shape Percept    ~15s, or circled flock
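The four parts of a tuple map naturally onto a small record of predicates. The sketch below encodes the “Away” example under stated assumptions: WorldView and its fields, the ActionTuple layout, and the helper functions are hypothetical, and the "~15s" limit is approximated as exactly 15 seconds. It is not C4's actual representation.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical snapshot of what the creature currently perceives.
struct WorldView {
    std::string heardUtterance;  // last utterance classified by perception
    bool sheepVisible;           // is a sheep-shape percept active?
    double secondsInAction;      // time spent in the current action
    bool flockCircled;           // has the flock been circled?
};

struct ActionTuple {
    std::function<bool(const WorldView&)> triggerContext;  // when to do it
    std::string action;                                    // what to do
    std::function<bool(const WorldView&)> objectContext;   // what to do it to
    std::function<bool(const WorldView&)> doUntilContext;  // for how long
};

// The "Away" row of the table.
ActionTuple MakeAwayTuple() {
    ActionTuple t;
    t.triggerContext = [](const WorldView& w) { return w.heardUtterance == "Away"; };
    t.action = "Circle Counterclockwise";
    t.objectContext = [](const WorldView& w) { return w.sheepVisible; };
    t.doUntilContext = [](const WorldView& w) {
        return w.secondsInAction >= 15.0 || w.flockCircled;
    };
    return t;
}

// A tuple can start when its trigger fires and a suitable object exists.
bool ShouldStart(const ActionTuple& t, const WorldView& w) {
    return t.triggerContext(w) && t.objectContext(w);
}

// A running tuple stops once its do-until condition holds.
bool ShouldStop(const ActionTuple& t, const WorldView& w) {
    return t.doUntilContext(w);
}
```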

Action Group
  Formed of similar action tuples
  Attention Group
    Chooses the creature’s focus
  Primary Action Group
    Determines large-scale body motion

Learning in C4’s Action System
  Events are perceived to have intrinsic values on a nominal scale
    Eating a cookie has a high positive value
    Meeting a lion has a large negative value
  The actions that lead to these events become weighted (credit assignment)
    Learning reliable action pairs
  Learning cannot happen without variation of action and state

Navigation System
  Navigation overrides the action system
    Must move to achieve an action
  Relieves the action system from handling movement
    Eat vs. approach and eat

Motor System
  No such thing as just a walk cycle
    Happy walk
    Sad walk
    And so on…
  Simple gestures
  Locomotion
  Eye/head/body orientation

Doing it in Style
  Parameterized motor actions
    Different levels (heights of paw for shake)
  Support new motor actions
    Characters increase in complexity over time
  Create new animations “live”
    Allow characters to learn new animations during their lives

Pose Graph
  Action between poses
  Movement between poses
  [Figure: pose graph with the nodes Gallop, Walk, Stalk, Skid, Stand, Beg, Shake, Sit, Crouch, and Down]
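If poses are nodes and movements are edges, getting from one pose to another becomes a graph search. The breadth-first sketch below illustrates that reading only; the PoseGraph type, FindPoseSequence, and any edges fed to it are assumptions, since the slide's figure did not survive and its actual connections are unknown.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical adjacency list: pose name -> poses reachable in one movement.
typedef std::map<std::string, std::vector<std::string>> PoseGraph;

// Shortest sequence of poses from `from` to `to` (both inclusive),
// or an empty vector if `to` cannot be reached.
std::vector<std::string> FindPoseSequence(const PoseGraph& graph,
                                          const std::string& from,
                                          const std::string& to) {
    std::map<std::string, std::string> cameFrom;
    std::queue<std::string> frontier;
    frontier.push(from);
    cameFrom[from] = from;

    while (!frontier.empty()) {
        const std::string current = frontier.front();
        frontier.pop();
        if (current == to) break;
        auto it = graph.find(current);
        if (it == graph.end()) continue;  // pose with no outgoing movements
        for (const std::string& next : it->second) {
            if (cameFrom.count(next) == 0) {
                cameFrom[next] = current;
                frontier.push(next);
            }
        }
    }

    std::vector<std::string> path;
    if (cameFrom.count(to) == 0) return path;
    // Walk backwards from the goal to the start, then reverse.
    for (std::string n = to;; n = cameFrom[n]) {
        path.push_back(n);
        if (n == from) break;
    }
    std::reverse(path.begin(), path.end());
    return path;
}
```

With made-up edges such as Gallop to Walk, Walk to Stand, Stand to Sit, and Sit to Down, a galloping creature asked to lie down would pass through each intermediate pose instead of snapping directly to the final one.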

Conclusions
  Strict enforcement of sensory honesty
    A creature must make decisions based on its percept of an object, rather than knowledge of the object’s state
  Creatures must perceive each other and interact
  Leads to a more realistic system of learning and honest mistakes
