Game Playing Revision Mini-Max search Alpha-Beta pruning General concerns on games

Game Playing (Revision)

- Mini-Max search
- Alpha-Beta pruning
- General concerns on games

Why study board games?

- One of the oldest subfields of AI (Shannon and Turing, 1950)
- Abstract and pure form of competition that seems to require intelligence
- Easy to represent the states and actions
- Very little world knowledge required!

Game playing is a special case of a search problem, with some new requirements.

Types of games

                        Deterministic                  Chance
Perfect information     Chess, checkers, go, othello   Backgammon, monopoly
Imperfect information   Sea battle                     Bridge, poker, scrabble, nuclear war

Why new techniques for games?

"Contingency" problem:
- We don't know the opponent's move!

The size of the search space:
- Chess: ~15 moves possible per state, 80 ply, so $15^{80}$ nodes in tree
- Go: ~200 moves per state, 300 ply, so $200^{300}$ nodes in tree

Game playing algorithms:
- Search tree only up to some depth bound
- Use an evaluation function at the depth bound
- Propagate the evaluation upwards in the tree
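To get a feel for these sizes, the following minimal Python sketch counts the decimal digits of the two tree sizes, taking the branching factors and ply counts quoted above at face value. The helper name `tree_size` is hypothetical and only serves the illustration.

```python
def tree_size(branching: int, ply: int) -> int:
    """Number of nodes at the last ply of a uniform tree: branching ** ply."""
    return branching ** ply


chess = tree_size(15, 80)    # figures as quoted on the slide
go = tree_size(200, 300)

# Python integers are exact, so the digit counts are exact as well.
print(len(str(chess)))  # decimal digits of 15^80
print(len(str(go)))     # decimal digits of 200^300
```

Both numbers are far beyond what any exhaustive search could visit, which is why the search is cut at a depth bound.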

MINI MAX

Restrictions:
- 2 players: MAX (computer) and MIN (opponent)
- deterministic, perfect information

Select a depth-bound (say: 2) and an evaluation function.

- Construct the tree up to the depth-bound
- Compute the evaluation function for the leaves
- Propagate the evaluation function upwards:
  - taking minima in MIN
  - taking maxima in MAX

[Figure: three-layer tree (MAX, MIN, MAX) with leaf evaluations 2, 5, 3, 1, 4, 4, 3. The MIN nodes get the values 2, 1, 3; the MAX root gets 3, and the move leading to that value is selected.]

The MINI-MAX algorithm:

Initialise depthbound;

Minimax(board, depth) =
    IF depth = depthbound
    THEN return static_evaluation(board);
    ELSE IF maximizing_level(depth)
    THEN FOR EACH child child of board
             compute Minimax(child, depth+1);
         return maximum over all children;
    ELSE IF minimizing_level(depth)
    THEN FOR EACH child child of board
             compute Minimax(child, depth+1);
         return minimum over all children;

Call: Minimax(current_board, 0)

Alpha-Beta Cut-off

Generally applied optimization on Mini-max.

Instead of:
- first creating the entire tree (up to depth-level)
- then doing all propagation

Interleave the generation of the tree and the propagation of values.

Point: some of the obtained values in the tree will provide information that other (non-generated) parts are redundant and do not need to be generated.

Alpha-Beta idea:

Principles:
- generate the tree depth-first, left-to-right
- propagate final values of nodes as initial estimates for their parent node.

[Figure: MAX root with two MIN children. The first MIN node has leaves 2 and 5 and final value 2, which becomes the estimate of the root. The second MIN node has a first leaf of 1 and therefore an estimate of 1.]

- The MIN-value (1) is already smaller than the MAX-value of the parent (2)
- The MIN-value can only decrease further,
- The MAX-value is only allowed to increase,
- No point in computing further below this node

Terminology:

- The (temporary) values at MAX-nodes are ALPHA-values
- The (temporary) values at MIN-nodes are BETA-values

[Figure: MAX root with two MIN children, leaves 2 and 5 under the first and 1 under the second. The temporary value 2 at the MAX root is labelled "Alpha-value"; the temporary value 1 at the second MIN node is labelled "Beta-value".]

The Alpha-Beta principles (1):

- If an ALPHA-value is greater than or equal to the Beta-value of a descendant node: stop generation of the children of the descendant

[Figure: MAX root with Alpha-value 2 (from a first MIN child with leaves 2 and 5); the second MIN child reaches Beta-value 1 after its first leaf, so its remaining children are not generated.]



The Alpha-Beta principles (2):

- If a Beta-value is smaller than or equal to the Alpha-value of a descendant node: stop generation of the children of the descendant

[Figure: example tree illustrating this cut-off.]



Mini-Max with Alpha-Beta at work:

Leaf evaluations, left to right (layers MAX, MIN, MAX above the leaves, 3 children per node):

8 7 3 | 9 1 6 | 2 4 1 | 1 3 5 | 3 9 2 | 6 5 2 | 1 2 3 | 9 7 2 | 8 6 4

[Figure: the tree is traversed depth-first, left-to-right, with the steps numbered 1 to 39. The three MIN nodes end with the values 4, 5 and 3; the MAX root ends with the value 5.]

11 static evaluations saved!!
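The two cut-off principles can be tried out on this 27-leaf tree with a short Python sketch. It is a minimal illustration, not a reference implementation: the nested-list encoding, the function name `alphabeta` and the one-element `counter` list are assumptions, and the leaves are grouped in threes because the example has three children per node.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, counter=None):
    """Minimax value of `node`, skipping subtrees that cannot change the result.

    `node` is a number (a leaf, i.e. one static evaluation) or a list of
    child nodes. `counter[0]` is incremented once per evaluated leaf.
    """
    if not isinstance(node, list):
        if counter is not None:
            counter[0] += 1
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, counter))
            alpha = max(alpha, value)
            if alpha >= beta:   # principle (2): a Beta-value above is <= this Alpha-value
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, counter))
        beta = min(beta, value)
        if beta <= alpha:       # principle (1): an Alpha-value above is >= this Beta-value
            break
    return value


# MAX root, three MIN nodes, nine MAX nodes, 27 leaves.
tree = [
    [[8, 7, 3], [9, 1, 6], [2, 4, 1]],
    [[1, 3, 5], [3, 9, 2], [6, 5, 2]],
    [[1, 2, 3], [9, 7, 2], [8, 6, 4]],
]
counter = [0]
value = alphabeta(tree, True, counter=counter)
evaluated = counter[0]
print(value, evaluated, 27 - evaluated)  # root value, leaves evaluated, leaves saved
```

Tracing the run by hand gives a root value of 5 with 16 leaves evaluated, so 11 of the 27 static evaluations are skipped.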

"DEEP" cut-offs

- For game trees with at least 4 Min/Max layers: the Alpha-Beta rules apply also to deeper levels.

[Figure: example of a deep cut-off in a tree with four Min/Max layers.]

The Gain: Best case:

- If at every layer the best node is the left-most one, only the part of the tree drawn THICK is explored.

[Figure: three-layer tree (MAX, MIN, MAX) with the explored part drawn in thick lines.]

Example of a perfectly ordered tree

MAX:     21
MIN:     21 | 12 | 3
MAX:     21 24 27 | 12 15 18 | 3 6 9
Leaves:  21 20 19 | 24 23 22 | 27 26 25 | 12 11 10 | 15 14 13 | 18 17 16 | 3 2 1 | 6 5 4 | 9 8 7

How much gain?

- Alpha / Beta: best case:

  # (static evaluations needed) =
      $2\,b^{d/2} - 1$                      (if $d$ is even)
      $b^{(d+1)/2} + b^{(d-1)/2} - 1$       (if $d$ is odd)

- The proof is by induction.
- For $d = 3$, $b = 3$ (the size of the running example): $3^2 + 3^1 - 1 = 11$ evaluations in the best case, out of 27 leaves. With the leaf order of the running example, 16 leaves were evaluated and 11 saved.
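Reading the formula as the number of static evaluations that alpha-beta still has to perform on a perfectly ordered tree, it can be written as a small Python function. The function name `best_case_evaluations` is hypothetical; `b` is the branching factor and `d` the depth, as above.

```python
def best_case_evaluations(b: int, d: int) -> int:
    """Static evaluations performed by alpha-beta on a perfectly ordered tree."""
    if d % 2 == 0:
        return 2 * b ** (d // 2) - 1
    return b ** ((d + 1) // 2) + b ** ((d - 1) // 2) - 1


for depth in range(1, 8):
    # Compare with the b ** depth leaves evaluated without pruning, for b = 10.
    print(depth, 10 ** depth, best_case_evaluations(10, depth))
```

For b = 3 and d = 3 the function returns 11, against 27 leaves without pruning. The growth is still exponential, but with roughly half the exponent.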

Best case gain pictured:

[Figure: number of static evaluations (axis from 10 to 100000) against depth (1 to 7), with one curve for no pruning (b = 10) and one for Alpha-Beta in the best case.]

- Note: logarithmic scale.
- Conclusion: still exponential growth!!
- Worst case?? For some trees alpha-beta does nothing. For some trees: impossible to reorder to avoid cut-offs.

The horizon effect.

[Figure: game tree with the horizon (= depth bound of mini-max) drawn across it. One line loses the queen above the horizon; another loses only a pawn above the horizon, but the queen is still lost below it.]

Because of the depth-bound we prefer to delay disasters, although we don't prevent them!!

Solution: heuristic continuations

Heuristic Continuation

In situations that are identified as strategically crucial (e.g. king in danger, imminent piece loss, pawn about to become a queen, ...), extend the search beyond the depth-bound!

[Figure: game tree in which some branches continue below the depth-bound.]
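A minimal Python sketch of the idea, on a toy tree that mimics the horizon effect. Everything concrete here is an assumption made for the illustration: the dict fields `value`, `children` and `crucial`, the parameter `max_extra` that caps how far the search may continue, and the material scores (pawn = 1, queen = 9).

```python
def minimax_with_continuation(node, depth, depth_bound, max_extra, maximizing):
    """Depth-bounded minimax that keeps searching below "crucial" positions.

    `node` is a dict with a static "value", optional "children" and an
    optional "crucial" flag. At most `max_extra` plies are added beyond
    the depth bound.
    """
    children = node.get("children", [])
    past_bound = depth >= depth_bound
    may_continue = node.get("crucial", False) and depth < depth_bound + max_extra
    if not children or (past_bound and not may_continue):
        return node["value"]
    values = [
        minimax_with_continuation(child, depth + 1, depth_bound, max_extra, not maximizing)
        for child in children
    ]
    return max(values) if maximizing else min(values)


# Scores from MAX's point of view.
delay = {"value": -1, "crucial": True,      # pawn given up, queen still attacked
         "children": [{"value": -10}]}      # queen lost one move later anyway
accept = {"value": -9}                      # queen lost immediately
root = {"value": 0, "children": [delay, accept]}

plain = minimax_with_continuation(root, 0, 1, 0, True)      # fixed depth bound 1
extended = minimax_with_continuation(root, 0, 1, 2, True)   # continue below crucial nodes
print(plain, extended)
```

With the fixed bound the delaying move looks best (value -1); with the continuation its real value (-10) is found and the search prefers the other move (-9).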

How to organize the continuations?

How to control (and stop) extending the tree?

Tapering search (or: heuristic pruning)

Order the moves in 1 layer by quality.

b(child) = b(parent) - (rank of child among its siblings)

[Figure: root with b = 4; its four children get b = 3, 2, 1, 0; the children of the b = 3 node get 2, 1, 0; the children of the b = 2 node get 1, 0; and so on.]
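The tapering rule can be sketched in a few lines of Python. The sketch assumes that a node with branching factor b keeps exactly its b best-ranked children, that ranks start at 1, and that a node with b = 0 is a leaf; the function names are hypothetical.

```python
def child_branching(b_parent: int) -> list:
    """Branching factors of the children kept below a node, best-ranked first."""
    return [b_parent - rank for rank in range(1, b_parent + 1)]


def tapered_tree_size(b: int) -> int:
    """Total number of nodes in the tapered tree rooted at a node with factor b."""
    return 1 + sum(tapered_tree_size(child_b) for child_b in child_branching(b))


print(child_branching(4))    # [3, 2, 1, 0]
print(child_branching(3))    # [2, 1, 0]
print(tapered_tree_size(4))  # 16
```

Under these assumptions the tapered tree below a node with factor b has 2^b nodes in total, so the search stops by itself without a separate depth bound.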

Time bounds:

How to play within reasonable time bounds?

Even with a fixed depth-bound, times can vary strongly!

Solution: Iterative Deepening!!!

Remember: the overhead of the previous searches is about 1/b of the cost of the final search.

Good investment to be sure to have a move ready.
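A rough Python sketch of iterative deepening under a time budget, together with the overhead estimate. The callback `search_to_depth`, the parameter names and the cap `max_depth` are assumptions; the sketch only checks the clock between iterations, whereas a real player would also have to abort a search that is still running when time is up.

```python
import time


def iterative_deepening(search_to_depth, time_budget_s, max_depth=64):
    """Search to depth 1, 2, 3, ... and return the result of the last finished search."""
    deadline = time.monotonic() + time_budget_s
    best = search_to_depth(1)            # always have some move ready
    for depth in range(2, max_depth + 1):
        if time.monotonic() >= deadline:
            break
        best = search_to_depth(depth)
    return best


def overhead_ratio(b, d):
    """Leaves visited by the searches to depths 1..d-1, relative to the depth-d search."""
    return sum(b ** k for k in range(1, d)) / b ** d


# Stand-in search that simply reports the depth it was asked to reach.
deepest = iterative_deepening(lambda depth: depth, time_budget_s=60.0, max_depth=4)
print(deepest)
print(overhead_ratio(10, 5))  # close to 1/b for b = 10
```

For b = 10 the shallower searches together cost roughly a tenth of the final one, which is the price paid for always having a move available.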

Games of chance

Ex.: Backgammon

Form of the game tree:

[Figure: game tree for a game of chance.]

"Utility" propagation with chances:

Utility function for a Maximizing node C:

$$\mathrm{expectimax}(C) = \sum_i P(d_i) \, \max_{s \in S(C, d_i)} \mathrm{utility}(s)$$

where
- $d_i$: outcome of the dice
- $P(d_i)$: probability of $d_i$
- $S(C, d_i)$: reachable positions from $C$ given $d_i$
- $\mathrm{utility}(s)$: evaluation of $s$

[Figure: node C at the MAX level with dice outcomes d1 ... d5 below it; the positions s1 ... s4 form S(C, d3); the next level is labelled Min.]
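The expectimax value of a maximizing node C is the sum, over the dice outcomes d_i, of P(d_i) times the best utility among the positions reachable from C given d_i. A minimal Python sketch follows; the dictionary encoding and the numbers are made up for the illustration and do not come from a real backgammon position.

```python
def expectimax(outcomes):
    """Expected utility of a maximizing node C.

    `outcomes` maps each dice outcome d_i to a pair
    (P(d_i), utilities of the positions in S(C, d_i)).
    """
    return sum(p * max(utilities) for p, utilities in outcomes.values())


# Hypothetical node with three dice outcomes whose probabilities sum to 1.
node_c = {
    "d1": (0.5, [1, 3]),
    "d2": (0.25, [2, 0]),
    "d3": (0.25, [4, -2, 1]),
}
result = expectimax(node_c)
print(result)  # 0.5 * 3 + 0.25 * 2 + 0.25 * 4
```

For a minimizing node the same sum would use `min` instead of `max`.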
