Game Theory
  Studied by mathematicians, economists, and finance researchers
  In AI, we limit games to deterministic, turn-taking, two-player, zero-sum games of perfect information
  This means deterministic, fully observable environments in which two agents' actions must alternate and in which the utility values at the end of the game are always equal and opposite
Slide 4
Types of Games
                           deterministic                   chance
  perfect information      Chess, Checkers, Go, Othello    Backgammon
  imperfect information                                    Bridge, Poker

  Game playing was one of the first tasks undertaken in AI
  Machines have surpassed humans at checkers and Othello, and have defeated human champions in chess and backgammon
  In Go, computers perform at the amateur level
Slide 5
Checkers
Slide 6
Games as Search Problems
  Games offer pure, abstract competition
  A chess-playing computer would be an existence proof of a machine doing something generally thought to require intelligence
  Games are idealizations of worlds in which
    the world state is fully accessible
    the (small number of) actions are well-defined
    uncertainty arises from moves by the opponent and from the complexity of games
Slide 7
Games as Search Problems (cont.-1)
  Games are usually much too hard to solve
  Example, chess:
    average branching factor = 35
    average moves per player = 50
    total number of nodes in search tree = 35^100, or about 10^154
    total number of different legal positions = 10^40
  Time limits for making good decisions
  Unlikely to find the goal, must approximate
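The chess estimate above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 100 plies in total (50 moves for each of the two players); the variable names are illustrative only.

```python
import math

b = 35        # average branching factor quoted for chess
plies = 100   # assumption: 50 moves per player, two players

# log10(b ** plies) = plies * log10(b) gives the order of magnitude.
exponent = plies * math.log10(b)

# Python integers are exact, so the digit count can be read off directly.
digits = len(str(b ** plies))

print(f"35^100 is roughly 10^{exponent:.1f} ({digits} decimal digits)")
```

The exponent comes out slightly above 154, which agrees with the 10^154 figure on the slide.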
Slide 8
Games as Search Problems (cont.-2)
  Initial State
    How does the game start?
  Successor Function
    A list of legal (move, state) pairs for each state
  Terminal Test
    Determines when the game is over
  Utility Function
    Provides a numeric value for all terminal states
    e.g., win, lose, draw with +1, -1, 0
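The four components above can be written down directly for a small game. The sketch below uses tic-tac-toe as an assumed example; the state encoding (a tuple of nine cells) and all function names are illustrative choices, not part of the slides.

```python
from typing import List, Optional, Tuple

State = Tuple[str, ...]   # nine cells, each "X", "O" or " "

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals


def initial_state() -> State:
    """Initial state: an empty board."""
    return (" ",) * 9


def to_move(state: State) -> str:
    """X moves first, so X is to move whenever the piece counts are equal."""
    return "X" if state.count("X") == state.count("O") else "O"


def winner(state: State) -> Optional[str]:
    """Return "X" or "O" if that player has completed a line, else None."""
    for i, j, k in LINES:
        if state[i] != " " and state[i] == state[j] == state[k]:
            return state[i]
    return None


def successors(state: State) -> List[Tuple[int, State]]:
    """Successor function: legal (move, state) pairs; a move is a cell index."""
    player = to_move(state)
    return [(i, state[:i] + (player,) + state[i + 1:])
            for i, cell in enumerate(state) if cell == " "]


def terminal_test(state: State) -> bool:
    """Terminal test: someone has won or the board is full."""
    return winner(state) is not None or " " not in state


def utility(state: State) -> int:
    """Utility from X's point of view: +1 win, -1 loss, 0 draw."""
    w = winner(state)
    return 0 if w is None else (1 if w == "X" else -1)
```

Because the game is zero-sum, a single number per terminal state is enough: O's utility is simply the negation of X's.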
Slide 9
Game Tree (2-player, deterministic, turns)
  Game tree complexity: 9! = 362,880
  Game board complexity: 3^9 = 19,683
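The two figures above match tic-tac-toe (an assumption, since the slide's figure is not reproduced here): 9! is the number of orders in which nine cells can be filled, and 3^9 is the number of ways to mark nine cells as X, O, or empty. Both are upper bounds, because play stops as soon as a line is completed. The sketch below, with hypothetical helper names, compares the bounds with the number of move sequences that can actually occur.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def won(board):
    """True if either player has three in a row."""
    return any(board[i] != " " and board[i] == board[j] == board[k]
               for i, j, k in LINES)


def count_games(board=(" ",) * 9, player="X"):
    """Count complete games (root-to-leaf paths) reachable from this board."""
    if won(board) or " " not in board:
        return 1
    nxt = "O" if player == "X" else "X"
    return sum(count_games(board[:i] + (player,) + board[i + 1:], nxt)
               for i in range(9) if board[i] == " ")


tree_bound = math.factorial(9)   # game tree complexity bound
board_bound = 3 ** 9             # game board complexity bound
actual_games = count_games()

print(tree_bound, board_bound, actual_games)
```

The exhaustive count is smaller than 9!, which shows how loose the factorial bound is even for a tiny game.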
Slide 10
Minimax Strategy
  Assumption: both players are knowledgeable and play the best possible move
  MinimaxValue(n) =
    Utility(n)                                   if n is a terminal state
    max_{s ∈ Successors(n)} MinimaxValue(s)      if n is a MAX node
    min_{s ∈ Successors(n)} MinimaxValue(s)      if n is a MIN node
Slide 11
Minimax Strategy (cont.)
  It is an optimal strategy
    Leads to outcomes at least as good as any other strategy when playing an infallible opponent
  Pick the option that most (max) minimizes the damage your opponent can do
    maximize the worst-case outcome
    because your skillful opponent will certainly find the most damaging move
Slide 12
Minimax
  Perfect play for deterministic, perfect-information games
  Idea: choose the move to the position with the highest minimax value
    = best achievable payoff against best play
Slide 13
Minimax Animated Example
  [Figure: game tree with Max, Min, and Max levels; node values not recoverable]
  The computer can obtain 6 by choosing the right-hand edge from the first node.
Slide 14
Minimax Algorithm

  function MINIMAX-DECISION(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state)
    return the action in SUCCESSORS(state) with value v

  function MAX-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← -∞
    for a, s in SUCCESSORS(state) do
      v ← MAX(v, MIN-VALUE(s))
    return v

  function MIN-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for a, s in SUCCESSORS(state) do
      v ← MIN(v, MAX-VALUE(s))
    return v
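The pseudocode above translates almost line for line into Python. The sketch below is one possible rendering under a simplifying assumption: the game is given as an explicit tree in which a leaf is a number (its utility) and an internal node is a list of children. The example tree is made up for illustration.

```python
import math


def is_terminal(node):
    """Assumption: leaves are plain numbers, internal nodes are lists."""
    return isinstance(node, (int, float))


def max_value(node):
    if is_terminal(node):
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child))
    return v


def min_value(node):
    if is_terminal(node):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child))
    return v


def minimax_decision(root):
    """Return the index of the root move with the highest minimax value."""
    return max(range(len(root)), key=lambda i: min_value(root[i]))


# Hypothetical two-ply tree: MAX at the root, three MIN nodes below it.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_decision(tree), max_value(tree))
```

On this tree the three MIN nodes are worth 3, 2, and 2, so MAX picks the first move and the root value is 3.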
Slide 15
Optimal Decisions in Multiplayer Games
  Extend the minimax idea to multiplayer games
  Replace the single value for each node with a vector of values
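A minimal sketch of the vector-valued idea, assuming that players move in a fixed rotation and that each player simply picks the child whose vector is best in that player's own component. The tree encoding (tuples for leaves, lists for internal nodes) and the example values are hypothetical.

```python
def vector_value(node, player, n_players):
    """Back up utility vectors; `player` is the index of the player to move."""
    if isinstance(node, tuple):          # leaf: one utility per player
        return node
    nxt = (player + 1) % n_players
    child_values = [vector_value(child, nxt, n_players) for child in node]
    # The player to move maximizes only its own component of the vector.
    return max(child_values, key=lambda vec: vec[player])


# Three players: player 0 moves at the root, player 1 at the next level.
tree = [
    [(1, 2, 6), (4, 3, 3)],
    [(6, 1, 2), (7, 4, 1)],
]
print(vector_value(tree, player=0, n_players=3))
```

Player 1 prefers (4, 3, 3) on the left and (7, 4, 1) on the right; player 0 then chooses the right branch because 7 > 4.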
Slide 16
Minimax Algorithm (cont.)
  Generate the whole game tree
  Apply the utility function to each terminal state
  Propagate the utility of terminal states up one level
    Utility(n) = max / min (n.1, n.2, ..., n.b)
  At the root, MAX chooses the move leading to the highest utility value
Slide 17
Analysis of Minimax
  Complete? Yes, but only if the tree is finite
  Optimal? Yes, against an optimal opponent
  Time? O(b^m), since it is a complete depth-first search
    m: maximum depth, b: number of legal moves
  Space? O(bm) if all successors are generated at once, or O(m) if successors are generated one at a time
  For chess, b ≈ 35 and m ≈ 100 for reasonable games
    Exact solution completely infeasible
Slide 18
Complex Games
  What happens if minimax is applied to large, complex games?
  What happens to the search space?
  Example, chess
    Decent amateur program: 1000 moves / second
    150 seconds / move (tournament play)
    Looks at approx. 150,000 moves
    Chess branching factor of 35
    Generates trees that are 3-4 ply
    Resultant play: pure amateur
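The 3-4 ply figure follows from the numbers on the slide: a budget of N positions with branching factor b reaches a depth of about log_b(N). A short check, with illustrative variable names:

```python
import math

moves_per_second = 1000
seconds_per_move = 150
b = 35

budget = moves_per_second * seconds_per_move   # positions examined per move
depth = math.log(budget) / math.log(b)         # solve b ** depth = budget

print(f"{budget} positions, about {depth:.2f} ply")
```

The result lies between 3 and 4, in line with the slide.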
Slide 19
α-β Pruning
  The problem of minimax search
    number of states to examine: exponential in the number of moves
  α-β pruning
    returns the same move as minimax would, but prunes away branches that cannot possibly influence the final decision
  α: lower bound on a MAX node, never decreasing
    value of the best (highest) choice found so far in the search for MAX
  β: upper bound on a MIN node, never increasing
    value of the best (lowest) choice found so far in the search for MIN

α-β Pruning (cont.)
  α cut-off
    Search is discontinued below any MIN node with min-value v ≤ α
  β cut-off
    Search is discontinued below any MAX node with max-value v ≥ β
  Order of considering successors matters (look at step f in previous slide)
    If possible, consider the best successors first
Slide 23
α-β Pruning (cont.)
  If n is worse than α, MAX will avoid it
    prune the branch
  If m is better than n for the player, we will never get to n in play, so just prune it
  [Figure: tree levels labeled MAX, MIN, MAX, MIN]

α-β Pruning Example - 3
  [Figure and trace table: α and β values recorded at nodes a through g of a MAX, MIN, MAX, MIN tree. Cut-offs are marked at nodes e and c; the remaining cell values were not recoverable.]
Slide 26
  [Trace table with columns Function, Node, α, β, V, Return: one row per MAX or MIN call on nodes A through G and their leaves. The row for E is marked "Cutoff 5 & 7" and the row for G is marked "Cutoff 1 & 5"; the remaining cell values were not recoverable.]
  Key: - = negative infinity; + = positive infinity
  The last value in a cell is the final value assigned to that variable, i.e., at the end of the search node A's α = 3.
Slide 27
α-β Algorithm

  function ALPHA-BETA-SEARCH(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state, -∞, +∞)
    return the action in SUCCESSORS(state) with value v

  function MAX-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← -∞
    for a, s in SUCCESSORS(state) do
      v ← MAX(v, MIN-VALUE(s, α, β))
      if v ≥ β then return v    // fail-high
      α ← MAX(α, v)
    return v
Slide 28
α-β Algorithm (cont.)

  function MIN-VALUE(state, α, β) returns a utility value
    inputs: state, current state in game
            α, the value of the best alternative for MAX along the path to state
            β, the value of the best alternative for MIN along the path to state
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for a, s in SUCCESSORS(state) do
      v ← MIN(v, MAX-VALUE(s, α, β))
      if v ≤ α then return v    // fail-low
      β ← MIN(β, v)
    return v
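The MAX-VALUE and MIN-VALUE routines with α and β can be sketched in Python as follows. As a simplifying assumption, the game is an explicit tree (numbers are leaves, lists are internal nodes), and the root loop that picks the move is one straightforward way to recover the action; the example trees are made up.

```python
import math


def is_terminal(node):
    """Assumption: leaves are plain numbers, internal nodes are lists."""
    return isinstance(node, (int, float))


def max_value(node, alpha, beta):
    if is_terminal(node):
        return node
    v = -math.inf
    for child in node:
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:
            return v              # fail-high: MIN above will never allow this
        alpha = max(alpha, v)
    return v


def min_value(node, alpha, beta):
    if is_terminal(node):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:
            return v              # fail-low: MAX above already has better
        beta = min(beta, v)
    return v


def alpha_beta_search(root):
    """Return (index of best root move, its value)."""
    best_move, alpha = None, -math.inf
    for i, child in enumerate(root):
        v = min_value(child, alpha, math.inf)
        if v > alpha:
            best_move, alpha = i, v
    return best_move, alpha


shallow = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
deeper = [[[1, 2], [3, 4]], [[5, 6], [0, 9]]]
print(alpha_beta_search(shallow), alpha_beta_search(deeper))
```

Both calls return the same move and value that plain minimax would, which is the defining property of the pruning.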
Slide 29
α-β Pruning Example - 4
  [Figure: game tree with MAX, MIN, MAX, MIN, and MAX levels and nodes a through n; leaf values not recoverable]
Slide 30
α-β Pruning Example - 5
Slide 31
Analysis of α-β Search
  Pruning does not affect the final result
  The effectiveness of α-β pruning is highly dependent on the order in which the successors are examined
  It is worthwhile to try to examine first the successors that are likely to be best
    e.g., Example 1 (e, f): if the successors of D are 2, 5, 14 (instead of 14, 5, 2), then 5 and 14 can be pruned
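The effect of ordering can be observed by recording which leaves α-β actually evaluates. In the sketch below only the last MIN node (14, 5, 2 versus 2, 5, 14) comes from the example above; the rest of the tree is a hypothetical stand-in, since the original figure is not reproduced here.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta on an explicit tree, logging every leaf it evaluates."""
    if isinstance(node, (int, float)):
        visited.append(node)
        return node
    v = -math.inf if maximizing else math.inf
    for child in node:
        child_v = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            v = max(v, child_v)
            if v >= beta:
                break             # remaining children are pruned
            alpha = max(alpha, v)
        else:
            v = min(v, child_v)
            if v <= alpha:
                break             # remaining children are pruned
            beta = min(beta, v)
    return v


def leaves_examined(tree):
    visited = []
    value = alphabeta(tree, -math.inf, math.inf, True, visited)
    return value, visited


worst_first = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
best_first = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]

print(leaves_examined(worst_first))
print(leaves_examined(best_first))
```

With 14, 5, 2 all three leaves of the last node are evaluated; with 2, 5, 14 the search stops after the 2, so 5 and 14 are pruned while the root value stays the same.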
Slide 32
Analysis of α-β Search (cont.)
  If the best move is examined first (perfect ordering),
    the total number of nodes examined = O(b^(m/2))
    effective branching factor = b^(1/2)
      for chess, 6 instead of 35
    i.e., α-β can look ahead roughly twice as far as minimax in the same amount of time
  If successors are examined in random order,
    the total number of nodes examined = O(b^(3m/4)) for moderate b
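The "6 instead of 35" and "twice as far" claims can be checked numerically. A small sketch, assuming the idealized node counts b^m and b^(m/2) with constant factors ignored:

```python
import math

b = 35
effective_b = math.sqrt(b)   # b^(1/2) under perfect move ordering


def nodes(branching, depth):
    """Idealized node count for a uniform tree of the given depth."""
    return branching ** depth


print(f"effective branching factor: {effective_b:.2f}")
print(f"minimax, depth 4:    {nodes(b, 4):.3g} nodes")
print(f"alpha-beta, depth 8: {nodes(effective_b, 8):.3g} nodes")
```

The last two lines print the same count: with perfect ordering, a depth-8 α-β search costs about as much as a depth-4 minimax search.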
Slide 33
Imperfect, Real-Time Decisions
  It is not practical to assume the program has time to search all the way to terminal states
  Since moves must be made in a reasonable amount of time, alter minimax or α-β in two ways
    Evaluation function (instead of utility function)
      an estimate of the expected utility of the game from a given position
    Cutoff test (instead of terminal test)
      decides when to apply Eval
      e.g., depth limit (perhaps add quiescence search)
Slide 34
Evaluation Functions
  The heuristic that estimates expected utility
    Should preserve the ordering among terminal states in the same way as the true utility function; otherwise it can cause bad decision making
    Computation cannot take too long
    For nonterminal states, it should be strongly correlated with the actual chances of winning
  Define features of the game state that assist in evaluation
    What are the features of chess?
    e.g., number of pawns possessed, etc.
  Weighted linear function
    Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_n f_n(s)
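A minimal sketch of a weighted linear evaluation function. The state representation (a dictionary of material differences), the choice of features, and the piece weights 1, 3, 3, 5, 9 are assumptions made for illustration; they are not given on the slide.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, int]   # hypothetical: own piece count minus opponent's


def weighted_linear_eval(weights: Sequence[float],
                         features: Sequence[Callable[[State], float]],
                         state: State) -> float:
    """Eval(s) = w_1 f_1(s) + w_2 f_2(s) + ... + w_n f_n(s)."""
    return sum(w * f(state) for w, f in zip(weights, features))


PIECES = ["pawn", "knight", "bishop", "rook", "queen"]

# One feature per piece type: the material difference for that piece.
features = [lambda s, p=p: s.get(p, 0) for p in PIECES]
weights = [1, 3, 3, 5, 9]   # assumed material values

# A position where the side to evaluate is up a knight and two pawns.
state = {"pawn": 2, "knight": 1}
score = weighted_linear_eval(weights, features, state)
print(score)
```

Here the score is 2 * 1 + 1 * 3 = 5. Material alone is a crude feature set; positional features can be added as further (weight, feature) pairs without changing the function.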
Slide 35
Evaluation Functions (cont.-1)
  (a) Black has an advantage of a knight and two pawns and will win the game
  (b) Black will lose after White captures the queen
Slide 36
Evaluation Functions (cont.-2)
  Digression: exact values don't matter
    Behavior is preserved under any monotonic transformation of Eval
    Only the order matters
      payoff in deterministic games acts as an ordinal utility function
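The claim that only the order of values matters can be tested directly: apply a strictly increasing function to every leaf value and check that minimax picks the same move. The tree and the transformation below are arbitrary examples chosen for illustration.

```python
def minimax(node, maximizing=True):
    """Minimax value of an explicit tree (numbers are leaves)."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(tree):
    """Index of the root move chosen by MAX."""
    return max(range(len(tree)), key=lambda i: minimax(tree[i], False))


def transform(node, f):
    """Apply f to every leaf, keeping the tree shape."""
    if isinstance(node, (int, float)):
        return f(node)
    return [transform(child, f) for child in node]


tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
rescaled = transform(tree, lambda x: x ** 3 + 7)   # strictly increasing

print(best_move(tree), best_move(rescaled))
```

Both calls select the same move even though every leaf value has changed. This holds for deterministic games; with chance nodes, where expected values are taken, it no longer holds for arbitrary monotonic transformations.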
Slide 37
Cutting off Search
  When do you use evaluation functions?
    if Cutoff-Test(state, depth) then return Eval(state)
    the simplest way to control the amount of search is to set a fixed depth limit d
    Cutoff-Test(state, depth) returns 1 or 0
      when 1 is returned (for all depths greater than some fixed depth d), use the evaluation function
  Options
    cut off beyond a certain depth
    cut off if the state is stable (more predictable)
    cut off moves you know are bad (forward pruning)
  Can have disastrous effects if evaluation functions are not sophisticated enough
    Should continue the search until a quiescent position is found
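A sketch of depth-limited minimax in which Cutoff-Test replaces the terminal test and Eval replaces the utility function. The tree encoding is an assumption for illustration: each node is a pair (heuristic estimate, list of children), and the estimates are made-up numbers.

```python
def cutoff_test(state, depth, limit):
    """Cut off at the depth limit or when there are no further moves."""
    _estimate, children = state
    return depth >= limit or not children


def evaluate(state):
    """Eval: here simply the heuristic estimate stored in the node."""
    return state[0]


def h_minimax(state, depth, limit, maximizing):
    if cutoff_test(state, depth, limit):
        return evaluate(state)
    values = [h_minimax(child, depth + 1, limit, not maximizing)
              for child in state[1]]
    return max(values) if maximizing else min(values)


# Hypothetical position: the left move looks good (5) at depth 1,
# but one ply deeper the opponent has a reply worth only 1 to us.
root = (0, [
    (5, [(1, []), (9, [])]),
    (2, [(3, []), (4, [])]),
])

print(h_minimax(root, 0, 1, True), h_minimax(root, 0, 2, True))
```

With a limit of 1 the search trusts the shallow estimates and values the root at 5; with a limit of 2 it sees the opponent's replies and values it at 3. This is the kind of misjudgment that a fixed depth limit can cause when the cutoff position is not quiescent.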
Slide 38
Cutting off Search (cont.)
  Does it work in practice?
    b^m = 10^6, b = 35 ⇒ m ≈ 4
  4-ply lookahead is a hopeless chess player
    4-ply ≈ human novice
    8-ply ≈ typical PC, human master
    12-ply ≈ Deep Blue, Kasparov
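The value m ≈ 4 comes from solving b^m = 10^6 for m. The sketch below also applies the O(b^(m/2)) bound for perfectly ordered α-β to the same budget; treating that bound as exact and ignoring constant factors is an assumption made for illustration.

```python
import math

b = 35
budget = 10 ** 6   # positions that can be examined per move

minimax_plies = math.log(budget) / math.log(b)   # solve b ** m = budget
alphabeta_plies = 2 * minimax_plies              # solve b ** (m / 2) = budget

print(f"minimax: {minimax_plies:.2f} ply, "
      f"alpha-beta (perfect ordering): {alphabeta_plies:.2f} ply")
```

Plain minimax reaches just under 4 ply on this budget, while idealized α-β approaches 8 ply.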
Slide 39
Horizon Effect
  A series of checks by the black rook forces the inevitable queening move by White over the horizon and makes the position look like a win for Black, when it is really a win for White
  The horizon effect arises when the program is facing a move by the opponent that causes serious damage and is ultimately unavoidable
  At present, no general solution has been found for the horizon problem
Slide 40
Suggestion
  Improve the evaluation function
    Know that the bishop is trapped
  Make the search deeper
  Make the search depth more flexible
    The program searches deeper in the line in which a pawn is being given away, and less deep in other lines
Slide 41
HW2, Deadline 4/12
  Design the evaluation functions for Chinese chess and Chinese dark chess.