erik-ambrose-powers
NEURAL NETWORKS
for games
NN: APPLICATIONS 1. Control
Controllers for robotic applications: the robot’s sensory system provides the inputs, and the outputs send responses to the robot’s motor control system.
How about in games?
NN: APPLICATIONS 2. Threat Assessment
Strategy/simulation type game
Use NN to predict the type of threat presented by the player at any time during gameplay.
NN: APPLICATIONS 3. Attack or Flee
RPG – to control how certain AI creatures behave
Handle AI creature’s decision making
NEURAL NETWORKS 101
Using a 3-layer feed-forward neural network as an example
Structure: Input, Hidden, Output (feed-forward process)
NEURAL NETWORKS 101 Input
What to choose as input is problem-specific
Keeping inputs to a minimum set will make training easier
Forms: Boolean, enumerated, continuous
The different scales used require the input values to be normalized
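The normalization step above can be sketched as follows. This is a minimal illustration, not the deck's own code: the function names and the choice to clamp out-of-range values are assumptions.

```cpp
#include <algorithm>

// Scale a raw value from the range [lo, hi] onto [0, 1].
// Values outside the range are clamped to the nearest end (an assumption).
double normalize(double value, double lo, double hi) {
    if (hi <= lo) return 0.0;  // degenerate range: avoid division by zero
    double scaled = (value - lo) / (hi - lo);
    return std::max(0.0, std::min(1.0, scaled));
}

// A Boolean input maps directly onto 0 or 1.
double normalizeBool(bool flag) { return flag ? 1.0 : 0.0; }
```

For example, `normalize(50.0, 0.0, 100.0)` puts a hit-point value of 50 out of 100 at the midpoint of the input range, the same scale a Boolean input uses.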
NEURAL NETWORKS 101 Weights
“Synaptic connection” in a biological NN
Weights influence the strength of inputs
Determining the weights involves “training” or “evolving” a NN
Every connection between neurons has an associated weight: the net input to a given neuron j is calculated from the outputs of the set of neurons i that feed into it
Net input to a given neuron is a linear combination of the weighted inputs from the neurons in the previous layer
NEURAL NETWORKS 101 Activation Functions
Takes the net input to a neuron, operates on it to produce an output for the neuron
Should be a nonlinear function for the NN to work as expected
Common: Logistic (or sigmoid) function
NEURAL NETWORKS 101 Activation Functions (cont’d)
Other well-known activation functions: Step function and hyperbolic tangent function
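The three activation functions named on these slides can be written in a few lines. This is a minimal sketch; the step function's `threshold` parameter and its default of 0 are assumptions.

```cpp
#include <cmath>

// Logistic (sigmoid) function: maps any net input into the range (0, 1).
double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Step function: output jumps from 0 to 1 once the net input reaches the threshold.
double step(double x, double threshold = 0.0) { return x >= threshold ? 1.0 : 0.0; }

// Hyperbolic tangent: similar S shape to the logistic, but with range (-1, 1).
double hyperbolicTangent(double x) { return std::tanh(x); }
```

A net input of 0 sits at the midpoint of the logistic function (output 0.5), which is why 0.5 is a natural activation threshold for outputs.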
NEURAL NETWORKS 101 Bias
Each neuron (except those in the input layer) has a bias associated with it
Bias term shifts the net input along the horizontal axis of the activation function, changing the threshold at which the neuron activates
Value: always 1 or -1
Its weight is also adjusted just like the other weights
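A small sketch of how the bias enters the calculation, assuming a logistic activation. The function and parameter names are illustrative, not taken from the slides.

```cpp
#include <cmath>

// Output of one neuron with a logistic activation and a bias term.
// biasValue is the fixed bias input (1 or -1); biasWeight is trained
// like any other weight and shifts where the neuron starts to activate.
double neuronOutput(double weightedInputs, double biasValue, double biasWeight) {
    double net = weightedInputs + biasValue * biasWeight;
    return 1.0 / (1.0 + std::exp(-net));
}
```

With a bias value of -1 and a bias weight of 2, the weighted inputs must reach 2 before the output crosses 0.5, so the bias weight effectively acts as the activation threshold.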
NEURAL NETWORKS 101 Output
Choice is also problem-specific
Same rule of thumb: keep the number to a minimum
Using a logistic function as the output activation, an output around 0.9 is considered activated (true) and an output around 0.1 is considered not activated (false)
In practice, we may not even get close to these values! So a threshold has to be set… Using the midpoint of the function (0.5) is a simple choice
If more than one output neuron is used, more than one output could be activated – it is easier to select just one output with a “winner-take-all” approach
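Both ways of reading the outputs can be sketched briefly: a fixed threshold at the midpoint, and winner-take-all. The function names and the tie-breaking rule (lowest index wins) are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Threshold test: an output counts as activated once it reaches the threshold.
bool isActivated(double output, double threshold = 0.5) {
    return output >= threshold;
}

// Winner-take-all: index of the output neuron with the highest activation,
// or -1 if there are no outputs. Ties go to the lowest index.
int winnerTakeAll(const std::vector<double>& outputs) {
    int best = -1;
    for (std::size_t i = 0; i < outputs.size(); ++i) {
        if (best < 0 || outputs[i] > outputs[static_cast<std::size_t>(best)]) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

For outputs 0.3, 0.7 and 0.6, the threshold test would report two activated neurons, while winner-take-all selects only index 1.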
NEURAL NETWORKS 101 Hidden Layer
Some NNs have no hidden layers, others have a few – this is a design decision
The more hidden layers, the more features the network can handle, and vice versa
Increasing the number of features (dimensionality) can enable better fit to the expected function
NEURAL NETWORKS 101 Back-propagation Training
Aim of training – to find values for the weights that connect the neurons such that the input data can generate the desired output values
Need a training set
Done iteratively
Optimization process – requires some measure of merit: an error measure that needs to be minimized
Error measures: Mean square error
NEURAL NETWORKS 101 Finding optimum weights iteratively
1. Start with training set consisting of input data and desired outputs
2. Initialize the weights in the NN to some small random values
3. With each set of input data, feed network and calculate output
4. Compare calculated output with desired output, compute error
5. Adjust weights to reduce error, repeat process
Each iteration is known as an “epoch”
NEURAL NETWORKS 101 Computing error
Most common error measure: mean square error, the average of the squared differences between the desired and calculated outputs
Goal: To get the error value as small as possible
Iteratively adjust the weight values, by calculating error associated with each neuron in output and hidden layers
NEURAL NETWORKS 101 Computing error (cont’d)
Output neuron error
Hidden-layer neuron error
No error is associated with input layer neurons because those neuron values are given
Can you observe how back-propagation is at work?
NEURAL NETWORKS 101 Adjusting weights
Calculate suitable adjustments for each weight in the network.
Adjustment to each weight:
New weight = Old weight + Δw
Adjustments are made for each individual weight
The learning rate p is a multiplier that affects how much each weight is adjusted.
NEURAL NETWORKS 101 Adjusting weights (cont’d)
Setting p too high might overshoot the optimum weights
Setting p too low might make training take too long
Special techniques: adding “momentum” (see textbook), or regularization (another technique)
APPLICATION: CHASING AND EVADING
Earlier example: Flocking and Chasing – a flock of units chasing a player
Applying neural networks
To decide whether to chase the player, evade him, or flock with other AI units
Simplistic method: creature always attacks the player, OR use a FSM “brain” (or other decision-making method) to decide between those actions based on conditions
APPLICATION: CHASING AND EVADING Neural Networks:
Advantage: the units not only make decisions but also adapt their behavior based on their experience with attacking the player
A “feedback” mechanism is useful to model “experience”, so that subsequent decisions can be improved or made “smarter”.
APPLICATION: CHASING AND EVADING How it works (example)
Assume we have 20 AI units moving on the screen
Behaviors: Chase, Evade, Flock with other units
Combat mode
When the player and AI units come within a specified radius of one another, they are assumed to be in combat
Combat is not simulated in detail – instead, a simple system is used whereby AI units in combat lose a number of HP every turn through the game loop
Player also loses a number of HP proportional to the number of AI units
A unit dies when HP = 0, and is respawned
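The simplified combat rules above can be sketched as one function called each turn of the game loop. All names and the `CombatRules` values are hypothetical, and this sketch assumes the player's loss is proportional to the number of units inside the combat radius. Resetting a respawned unit's position is omitted.

```cpp
#include <cmath>
#include <vector>

struct Unit {
    double x;
    double y;
    int hitPoints;
};

// Tunable constants for the sketch; the values are chosen by the caller.
struct CombatRules {
    double combatRadius;
    int unitDamagePerTurn;
    int playerDamagePerUnit;
    int respawnHitPoints;
};

// Applies one turn of combat and returns how many units were in combat.
int combatTurn(std::vector<Unit>& units, double playerX, double playerY,
               int& playerHitPoints, const CombatRules& rules) {
    int engaged = 0;
    for (Unit& u : units) {
        double distance = std::hypot(u.x - playerX, u.y - playerY);
        if (distance > rules.combatRadius) continue;  // not in combat
        ++engaged;
        u.hitPoints -= rules.unitDamagePerTurn;
        if (u.hitPoints <= 0) {
            u.hitPoints = rules.respawnHitPoints;  // unit died: respawn it
        }
    }
    playerHitPoints -= engaged * rules.playerDamagePerUnit;
    if (playerHitPoints < 0) playerHitPoints = 0;
    return engaged;
}
```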
APPLICATION: CHASING AND EVADING “Brain”
All AI units share the same “brain”
The brain evolves as the units gain experience with the player
Implement back-propagation so that the NN’s weights can be adjusted in real time
Assume all AI units evolve collectively
Expectations
AI becomes more aggressive if the player is weak
AI becomes more withdrawn if the player is strong
AI learns to stay in a flock to have a better chance of defeating the player
APPLICATION: INITIALIZATION AND TRAINING Initialize values for neural network
Number of neurons in each layer – 4 inputs, 3 hidden neurons, 3 output neurons
APPLICATION: INITIALIZATION AND TRAINING
Preparation for training
Initialize learning rate to 0.2 – tuned by trial-and-error with the aim of keeping the training time down while maintaining accuracy
Data is dumped into a text file so that it can be referred to during debugging
Training loop – cycle through until…
Calculated error is less than some specified value, OR
Number of iterations reaches a specified maximum
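The two stopping conditions can be sketched as a small driver loop. Here `runEpoch` is a hypothetical callback that trains for one epoch and returns that epoch's error; the names and the result struct are assumptions.

```cpp
#include <functional>

struct TrainingResult {
    int epochs;    // number of epochs actually run
    double error;  // error reported by the last epoch (0 if none were run)
};

// Repeats runEpoch until its error drops below targetError
// or maxEpochs iterations have been performed, whichever comes first.
TrainingResult trainUntil(const std::function<double()>& runEpoch,
                          double targetError, int maxEpochs) {
    TrainingResult result = {0, 0.0};
    while (result.epochs < maxEpochs) {
        result.error = runEpoch();
        ++result.epochs;
        if (result.error < targetError) break;
    }
    return result;
}
```

The iteration cap matters because a badly chosen learning rate or training set may never reach the target error, and the loop would otherwise run forever.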
APPLICATION: INITIALIZATION AND TRAINING
Sample training data for NN
double TrainingSet[14][7] = {
    // #Friends, Hit points, Enemy Engaged, Range, Chase, Flock, Evade
    0, 1, 0, 0.2, 0.9, 0.1, 0.1,
    0, 1, 1, 0.2, 0.9, 0.1, 0.1,
    0, 1, 0, 0.8, 0.1, 0.1, 0.1,
    . . . .
14 sets of input and output values
All data values are normalized to the range 0.0-1.0
Use 0.1 for inactive (false) output and 0.9 for active (true) output – impractical to achieve 0 or 1 for NN output, so use reasonable target values
APPLICATION: INITIALIZATION AND TRAINING
Training data was chosen empirically
Assume a few arbitrary input conditions and then specify a reasonable response.
In practice, you are likely to design more training sets than what was shown in the example
Training loop
Error is initialized to 0, then calculated for each ‘epoch’ (one pass through all 14 sets of inputs/outputs)
For each set of data,
1. Feed-forward performed
2. Error calculated and accumulated
3. Back-propagation to adjust connection weights
Average error calculated (divide by 14)
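One epoch of this loop can be sketched for a 4-3-3 network as below. This is a minimal illustration, not the deck's implementation: the struct layout, function names, weight range, seed handling, and the logistic-based error formulas are assumptions. `visibleSamples` holds only the three training rows shown on the sample-data slide, so the average divides by the number of rows supplied rather than by 14.

```cpp
#include <array>
#include <cmath>
#include <random>
#include <vector>

const int kInputs = 4, kHidden = 3, kOutputs = 3;

struct Sample {
    std::array<double, kInputs> in;
    std::array<double, kOutputs> target;
};

struct Network {
    double wIH[kInputs][kHidden];   // input -> hidden weights
    double wHO[kHidden][kOutputs];  // hidden -> output weights
    double bH[kHidden];             // hidden bias weights (bias value fixed at 1)
    double bO[kOutputs];            // output bias weights
    double hidden[kHidden];         // last hidden activations
    double out[kOutputs];           // last output activations
};

double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Small random starting weights; the range is an arbitrary choice.
void initialize(Network& net, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_real_distribution<double> small(-0.5, 0.5);
    for (auto& row : net.wIH) for (double& w : row) w = small(rng);
    for (auto& row : net.wHO) for (double& w : row) w = small(rng);
    for (double& b : net.bH) b = small(rng);
    for (double& b : net.bO) b = small(rng);
}

void feedForward(Network& net, const std::array<double, kInputs>& in) {
    for (int j = 0; j < kHidden; ++j) {
        double sum = net.bH[j];
        for (int i = 0; i < kInputs; ++i) sum += in[i] * net.wIH[i][j];
        net.hidden[j] = logistic(sum);
    }
    for (int k = 0; k < kOutputs; ++k) {
        double sum = net.bO[k];
        for (int j = 0; j < kHidden; ++j) sum += net.hidden[j] * net.wHO[j][k];
        net.out[k] = logistic(sum);
    }
}

// Computes both layers' errors first, then adjusts every weight.
void backPropagate(Network& net, const Sample& s, double rate) {
    double dOut[kOutputs], dHid[kHidden];
    for (int k = 0; k < kOutputs; ++k)
        dOut[k] = (s.target[k] - net.out[k]) * net.out[k] * (1.0 - net.out[k]);
    for (int j = 0; j < kHidden; ++j) {
        double sum = 0.0;
        for (int k = 0; k < kOutputs; ++k) sum += dOut[k] * net.wHO[j][k];
        dHid[j] = sum * net.hidden[j] * (1.0 - net.hidden[j]);
    }
    for (int k = 0; k < kOutputs; ++k) {
        for (int j = 0; j < kHidden; ++j) net.wHO[j][k] += rate * dOut[k] * net.hidden[j];
        net.bO[k] += rate * dOut[k];
    }
    for (int j = 0; j < kHidden; ++j) {
        for (int i = 0; i < kInputs; ++i) net.wIH[i][j] += rate * dHid[j] * s.in[i];
        net.bH[j] += rate * dHid[j];
    }
}

// One epoch: feed-forward, accumulate error, back-propagate; returns the average error.
double trainEpoch(Network& net, const std::vector<Sample>& set, double rate) {
    double error = 0.0;
    for (const Sample& s : set) {
        feedForward(net, s.in);
        for (int k = 0; k < kOutputs; ++k) {
            double diff = s.target[k] - net.out[k];
            error += diff * diff / kOutputs;
        }
        backPropagate(net, s, rate);
    }
    return set.empty() ? 0.0 : error / static_cast<double>(set.size());
}

// Only the three rows visible on the sample-data slide; the rest are elided there.
std::vector<Sample> visibleSamples() {
    return {
        {{{0.0, 1.0, 0.0, 0.2}}, {{0.9, 0.1, 0.1}}},
        {{{0.0, 1.0, 1.0, 0.2}}, {{0.9, 0.1, 0.1}}},
        {{{0.0, 1.0, 0.0, 0.8}}, {{0.1, 0.1, 0.1}}},
    };
}
```

Calling `trainEpoch` repeatedly with a learning rate of 0.2 should drive the returned average error down from its value on the first epoch.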
APPLICATION: LEARNING Updating AI Units – cycle thru all
Calculate distance from the current unit to target
Check if the target is killed. If it is, check whether the current unit is within combat range of the target. If it is, retrain the NN to reinforce chase behavior (the unit is doing something right, so train it to be more aggressive). Otherwise, retrain the NN to reinforce the other behaviors.
APPLICATION: USING THE NN
Use the trained NN for real-time decision-making
Under the current set of conditions in real time, the output will show which behavior the unit should take
REMEMBER: Input values have to be consistently normalized as well before feeding them through the NN!
Feed-forward is applied
Output values are then examined to derive the proper choice of behavior
Simple way – just select the output with the highest activation
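The decision step can be sketched as: normalize the unit's current state, feed it forward, and pick the output with the highest activation. The input and output order follows the training-set columns (#Friends, Hit points, Enemy Engaged, Range; Chase, Flock, Evade). The `Scales` maxima and the `feedForward` callback standing in for the trained network are hypothetical.

```cpp
#include <array>
#include <functional>

enum class Behavior { Chase = 0, Flock = 1, Evade = 2 };

struct UnitState {
    int friends;
    double hitPoints;
    bool enemyEngaged;
    double range;
};

// Assumed maxima used to scale raw values into the 0.0-1.0 range.
struct Scales {
    double maxFriends;
    double maxHitPoints;
    double maxRange;
};

using Inputs = std::array<double, 4>;
using Outputs = std::array<double, 3>;

double clamp01(double v) { return v < 0.0 ? 0.0 : (v > 1.0 ? 1.0 : v); }

// Normalize the same way the training data was normalized.
Inputs normalizeState(const UnitState& s, const Scales& k) {
    return {{clamp01(s.friends / k.maxFriends),
             clamp01(s.hitPoints / k.maxHitPoints),
             s.enemyEngaged ? 1.0 : 0.0,
             clamp01(s.range / k.maxRange)}};
}

// Feed the normalized state through the network and select the strongest output.
Behavior chooseBehavior(const UnitState& s, const Scales& k,
                        const std::function<Outputs(const Inputs&)>& feedForward) {
    Outputs out = feedForward(normalizeState(s, k));
    int best = 0;
    for (int i = 1; i < 3; ++i) {
        if (out[i] > out[best]) best = i;
    }
    return static_cast<Behavior>(best);
}
```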
APPLICATION: OUTCOME
Some outcomes of this AI:
If the target is left to die without inflicting much damage on the units, AI units will adapt to attack more often (target perceived as weak)
If the target inflicts massive damage, AI units will adapt to avoid the target more (target perceived as strong)
AI units also adapt to flock together if they are faced with a strong target
APPLICATION: OUTCOME
Some outcomes of this AI:
Interesting emergent behavior: leaders emerge in flocks, and intermediate and trailing units follow the lead.
Q: How is it possible to design such behaviors?
QUESTIONS?