For games. 1. Control Controllers for robotic applications. Robot’s sensory system provides inputs and output sends the responses to the robot’s motor

NEURAL NETWORKS

for games

NN: APPLICATIONS1. Control

Controllers for robotic applications. Robot’s sensory system provides inputs and

output sends the responses to the robot’s motor control system.

How about in games?

NN: APPLICATIONS2. Threat Assessment

Strategy/simulation type game Use NN to predict the type of threat

presented by the player at any time during gameplay.

NN: APPLICATIONS3. Attack or Flee

RPG – to control how certain AI creatures behave

Handle AI creature’s decision making

NEURAL NETWORKS 101 Using a 3-layer feed-forward neural

network as example Structure

Input Hidden Output : Feed-forward process

NEURAL NETWORKS 101 Input

What to choose as input is problem-specificKeeping inputs to a minimum set will make

training easier. Forms: Boolean, enumerated, continuousThe different scales used require the input

values to be normalized

NEURAL NETWORKS 101 Weights

“Synaptic connection” in a biological NNWeights influence the strength of inputsDetermining the weights involve “training” or

“evolving” a NNEvery connection between neurons has an

associated weight – net input to a given neuron j is calculated from a set of input i neurons

Net input to a given neuron is a linear combination of weighted inputs from other neurons from previous layer

NEURAL NETWORKS 101 Activation Functions

Takes the net input to a neuron, operates on it to produce an output for the neuron

Should be nonlinear functions, for NN to work as expected

Common: Logistic (or sigmoid) function

NEURAL NETWORKS 101 Activation Functions (cont’d)

Other well-known activation functions: Step function and hyperbolic tangent function

NEURAL NETWORKS 101 Bias

Each neuron (except from input layer) has a bias associated to it

Bias term shifts net input along horizontal axis of activation function, changing the threshold it activates

Value: always 1 or -1 Its weight also adjusted just like other

weights

NEURAL NETWORKS 101 Output

Choice is also problem-specific Same rule of thumb: Keep number to minimum Using a logistic function as output activation,

an output around 0.9 is considered activated or true, 0.1 considered not activated or false

In practice, we may not even get close to these values! So, a threshold has to be set… Using midpoint of the function (0.5) is a simple choice

If more than one output neuron is used, more than one outputs could be activated – easier to select just one output by “winner-take-all” approach

NEURAL NETWORKS 101 Hidden Layer

Some NNs have no hidden layers or a few hidden layers – design

The more hidden layers, the more features the network can handle, and vice versa

Increasing the number of features (dimensionality) can enable better fit to the expected function

NEURAL NETWORKS 101 Back-propagation Training

Aim of training – to find values for the weights that connect the neurons such that the input data can generate the desired output values

Need a training setDone iterativelyOptimization process – requires some

measure of merit: Error measure that needs to be minimize

Error measures: Mean square error

NEURAL NETWORKS 101 Finding optimum weights iteratively

1. Start with training set consisting of input data and desired outputs

2. Initialize the weights in the NN to some small random values

3. With each set of input data, feed network and calculate output

4. Compare calculated output with desired output, compute error

5. Adjust weights to reduce error, repeat process

Each iteration is known as an “epoch”

NEURAL NETWORKS 101 Computing error

Most common error measure: Mean square error, or average of the square of difference between desired and calculated output:

Goal: To get the error value as small as possible

Iteratively adjust the weight values, by calculating error associated with each neuron in output and hidden layers

NEURAL NETWORKS 101 Computing error (cont’d)

Output neuron error

Hidden-layer neuron error

No error is associated with input layer neurons because those neuron values are given

Can you observe how back-propagation is at work?

NEURAL NETWORKS 101 Adjusting weights

Calculate suitable adjustments for each weight in the network.

Adjustment to each weight:

New weight = Old weight + w Adjustments are made for each individual

weightThe learning rate p is a multiplier that affects

how much each weight is adjusted.

NEURAL NETWORKS 101 Adjusting weights (cont’d)

Setting p too high, might overshoot the optimum weights

Setting p too low, training might take too long

Special technique Adding “momentum” (see textbook), or regularization (another technique)

APPLICATION: CHASING AND EVADING Earlier example

Flocking and Chasing – A flock of units chasing a player

Applying neural networksTo decide whether to chase the player,

evade him, or flock with other AI unitsSimplistic method: Creature always attack

player, OR use a FSM “brain” (or other decision-making method) to decide between those actions based on conditions

APPLICATION: CHASING AND EVADING Neural Networks:

Advantage: Not only for making decisions but to adapt their behavior given their experience with attacking the player

A “feedback” mechanism is useful to model “experience”, so that subsequent decisions can be improved or made “smarter”.

APPLICATION: CHASING AND EVADING How it works (example)

Assume we have 20 AI units moving on the screen

Behaviors: Chase, Evade, Flock with other units Combat mode

When player and AI units come within a specified radius of one another, assume to be in combat

Combat will not be simulated – but use a simple system whereby AI units will lose a number of HP every turn through the game loop

Player also loses a number of HP proportional to number of AI units

A unit dies when HP = 0, and is respawned

APPLICATION: CHASING AND EVADING “Brain”

All AI units share the same “brain”The brain evolves as the unit gains

experience with the player Implement back-propagation so that the NN’s

weights can be adjusted in real timeAssume all AI units evolve collectively

ExpectationsAI become more aggressive if player is weakAI become more withdrawn if player is strongAI learns to stay in flock to have better

chance of defeating player

APPLICATION: INITIALIZATION AND TRAINING Initialize values for neural network

Number of neurons in each layer – 4 inputs, 3 hidden neurons, 3 output neurons

APPLICATION: INITIALIZATION AND TRAINING

Preparation for training Initialize learning rate to 0.2 – tuned by trial-

and-error with the aim of keeping the training time down while maintaining accuracy

Data is dumped into a text file so that it can be referred during debugging

Training loop – cycle through until… Calculated error is less than some specified value,

OR Number of iterations reach a specified maximum


Sample training data for NN

double TrainingSet[14][7] = {

//#Friends, Hit points, Enemy Engaged, Range, Chase, Flock, Evade

0, 1, 0, 0.2, 0.9, 0.1, 0.1,

0, 1, 1, 0.2, 0.9, 0.1, 0.1,

0, 1, 0, 0.8, 0.1, 0.1, 0.1,

. . . .

14 sets of input and output valuesAll data values are within range from 0.0-1.0,

normalizedUse 0.1 for inactive (false) output and 0.9 for

active (true) output – impractical to achieve 0 or 1 for NN output, so use reasonable target value


Training data was chosen empiricallyAssume a few arbitrary input conditions and

then specified a reasonable response. In practice, you are likely to design more

training sets than what was shown in example Training loop

Error initialize to 0, can calculated for each ‘epoch’ (once thru all 14 sets of inputs/outputs)

For each set of data, 1. Feed-forward performed2. Error calculated and accumulated3. Back-propagation to adjust connection weights

Average error calculated (divide by 14)

APPLICATION: LEARNING Updating AI Units – cycle thru all

Calculate distance from the current unit to target

Check if target is killed. If it is, then check where current unit is in relation to target (if it is in the combat range). If it is, retrain NN to reinforce chase behavior (unit doing something right, so train it to be more aggressive). Otherwise, retraining NN will reinforce other behaviors.

APPLICATION: USING THE NN Use the trained NN for real-time decision-

makingUnder the current set of conditions in real-

time, output will show which behavior the unit should take

REMEMBER: Input values have to be consistently normalized as well before feeding thru NN!

Feed-forward is appliedOutput values are then examined to derive

the proper choice of behavior Simple way – just select output with highest

activation

APPLICATION: OUTCOME

Some outcomes of this AI: If target is left to die without inflicting

much damage on the units AI units will adapt to attack more often (target perceived as weak)

If target inflicts massive damage AI units will adapt to avoid target more (target perceived as strong)

AI units also adapt to flock together if they are faced with strong target

APPLICATION: OUTCOME

Some outcomes of this AI: Interesting emergent behavior Leaders

emerge in flocks, intermediate and trailing units will follow the lead. Q: How is it possible to design such behaviors??

QUESTIONS?

Documents

For games. 1. Control Controllers for robotic applications. Robot’s sensory system provides inputs and output sends the responses to the robot’s motor