rafe-watson
Games & search
• “Unpredictable” opponent: a solution is a strategy specifying a move for every possible opponent reply
• Time limits: unlikely to reach a goal state, so the search must approximate
Brief history of game playing…
• Computer considers possible lines of play (Babbage, 1846)
• Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)
• Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948; Shannon, 1950)
• First chess program (Turing, 1951)
• Machine learning to improve evaluation accuracy (Samuel, 1952-57)
• Pruning to allow deeper search (McCarthy, 1956)
Game tree: 2-player/deterministic/turns
Minimax
• Perfect play for deterministic, perfect-information games
• Choose position with highest minimax value
– best achievable payoff against best play
Minimax algorithm
Example
6 7 3 -8 9 8 -1 6 1 0 0 2 4 -1 2 -3
Solution
Leaves: 6 7 3 -8 9 8 -1 6 1 0 0 2 4 -1 2 -3
MIN: 6 -8 8 -1 0 0 -1 -3
MAX: 6 8 0 -1
MIN: 6 -1
MAX (root): 6
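The minimax example above can be reproduced with a short sketch. It assumes the tree is a complete binary tree built from the 16 leaf values, with MAX at the root; the names `build_tree` and `minimax` are illustrative and not taken from the slides.

```python
def build_tree(leaves, branching=2):
    """Group leaf values bottom-up into nested lists (assumed complete tree)."""
    level = list(leaves)
    while len(level) > 1:
        level = [level[i:i + branching] for i in range(0, len(level), branching)]
    return level[0]


def minimax(node, maximizing):
    """Return the minimax value of a nested-list game tree."""
    if isinstance(node, int):  # leaf: payoff for MAX
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


leaves = [6, 7, 3, -8, 9, 8, -1, 6, 1, 0, 0, 2, 4, -1, 2, -3]
tree = build_tree(leaves)
root_value = minimax(tree, maximizing=True)
print(root_value)  # 6, matching the worked solution
```

Evaluating the two subtrees of the root separately gives 6 and -1, the same values as the second row from the top of the solution.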
Properties of minimax
• Complete
– Yes, if tree is finite (chess has specific rules for this)
• Optimal
– Yes, against an optimal opponent
• Time complexity
– O(b^m), for branching factor b and maximum depth m
• Space complexity
– O(bm) (depth-first exploration)
• For chess, b ≈ 35 and m ≈ 100 for “reasonable” games, so an exact solution is completely infeasible
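To give a feel for why the exact solution is infeasible, the following sketch evaluates b^m for the chess figures quoted above (b ≈ 35, m ≈ 100). The variable names are illustrative only.

```python
import math

b, m = 35, 100                # approximate figures from the slide
nodes = b ** m                # size of the full game tree, O(b^m)
digits = len(str(nodes))      # number of decimal digits in b^m
exponent = m * math.log10(b)  # b^m = 10^exponent

print(f"35^100 is about 10^{exponent:.0f} ({digits} digits)")
```

The result is on the order of 10^154 nodes, far beyond anything that can be enumerated.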
Optimisation using α-β pruning
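[Figure: α-β pruning worked step by step on a two-ply MAX/MIN tree; leaf groups shown include 3, 12, 8 and 14, 5, 2, with backed-up values 3 and 2]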
α-β pruning algorithm
Example
6 7 3 -8 9 8 -1 6 1 0 0 2 4 -1 2 -3
Solution
Leaves: 6 7 3 -8 9 8 -1 6 1 0 0 2 4 -1 2 -3
MIN: 6 ≤3 8 (pruned) 0 ≤0 -1 -3
MAX: 6 ≥8 0 -1
MIN: 6 -1
MAX (root): 6
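A minimal sketch of α-β pruning on the same 16-leaf example, assuming a complete binary tree with MAX at the root. The names `build_tree`, `alphabeta` and `visited` are illustrative. This version passes the α and β bounds all the way down from the root, so it may examine fewer leaves than a hand trace that only uses bounds from sibling nodes; the root value is the same either way.

```python
import math


def build_tree(leaves, branching=2):
    """Group leaf values bottom-up into nested lists (assumed complete tree)."""
    level = list(leaves)
    while len(level) > 1:
        level = [level[i:i + branching] for i in range(0, len(level), branching)]
    return level[0]


def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax value with alpha-beta cutoffs; records each leaf examined."""
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:  # MIN above will never allow this line
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:      # MAX above already has something at least as good
            break
    return value


leaves = [6, 7, 3, -8, 9, 8, -1, 6, 1, 0, 0, 2, 4, -1, 2, -3]
visited = []
root_value = alphabeta(build_tree(leaves), -math.inf, math.inf, True, visited)
print(root_value, len(visited), "of", len(leaves), "leaves examined")
```

The pruned search returns the same root value as plain minimax while skipping part of the tree.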
Properties of α-β pruning
• pruning does not affect final result
• good move ordering improves effectiveness of pruning
• with "perfect ordering," time complexity is O(b^(m/2)), which doubles the depth that can be searched in the same time
• example of reasoning about which computations are relevant: metareasoning
Evaluation functions
• Heuristic for game state:
– order the terminal states correctly
– fast to compute
– good estimate of “chance of winning”
• Examples:
– chess: weighted linear combination of features
– e.g. pawn=1, bishop=3, etc.
– based on judgements of chess experts
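A weighted linear evaluation of this kind can be sketched as a material count. Only pawn=1 and bishop=3 come from the slide; the knight, rook and queen weights below are the conventional textbook values and are an assumption here, as are the names `PIECE_VALUES` and `material_eval`.

```python
# Weights: pawn and bishop from the slide; N, R, Q are assumed conventional values.
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material_eval(white_pieces, black_pieces, weights=PIECE_VALUES):
    """Weighted linear combination: sum of weight * (white count - black count)."""
    def total(pieces):
        return sum(weights[piece] * count for piece, count in pieces.items())
    return total(white_pieces) - total(black_pieces)


# Hypothetical position: White is a bishop up, Black is a pawn up.
white = {"P": 7, "B": 2, "R": 2, "Q": 1}
black = {"P": 8, "B": 1, "R": 2, "Q": 1}
score = material_eval(white, black)
print(score)  # positive means the position favours White
```

A real evaluation function would add further features (mobility, king safety, pawn structure), each with its own weight.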
Applying evaluation functions
• perform minimax and use the evaluation function at the maximum depth
– e.g. fixed depth limit, iterative deepening
• secondary search:
– address nonquiescent states
– address the horizon effect
– select clearly superior moves
• singular extensions (b=1)
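The depth-limited idea can be sketched on a toy game, chosen here purely for illustration: a pile of stones, each move removes 1 or 2, and the player who takes the last stone wins. The function names and the "return 0 when unknown" heuristic are assumptions, not part of the slides.

```python
def successors(pile):
    """Legal moves: remove 1 or 2 stones."""
    return [pile - k for k in (1, 2) if pile - k >= 0]


def is_terminal(pile):
    return pile == 0


def evaluate(pile, maximizing):
    """Exact at terminal states; a neutral guess (0) at the depth cutoff."""
    if pile == 0:
        # The player to move has no stones left, so the other player won.
        return -1 if maximizing else 1
    return 0


def depth_limited_minimax(pile, depth, maximizing):
    """Minimax that applies the evaluation function at the depth limit."""
    if is_terminal(pile) or depth == 0:
        return evaluate(pile, maximizing)
    values = [depth_limited_minimax(s, depth - 1, not maximizing)
              for s in successors(pile)]
    return max(values) if maximizing else min(values)


def iterative_deepening(pile, max_depth):
    """Re-run the search with limits 1..max_depth, keeping each estimate."""
    return [depth_limited_minimax(pile, d, True) for d in range(1, max_depth + 1)]


estimates = iterative_deepening(4, 4)
print(estimates)  # the estimate firms up once the search reaches terminal states
```

With a time limit, the value from the deepest completed iteration would be the one acted on.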
Deterministic games in practice
• Checkers:
– Chinook ended the 40-year reign of human world champion Marion Tinsley (1994)
– Perfect play for positions with 8 pieces or fewer (endgame database)
• Othello:
– champions won’t play computers (which are too good)
• Go:
– champions won’t play computers (as not good enough)
– branching factor can exceed 300
• Chess:
– Deep Blue defeats Kasparov in 6-game match (1997)
Deep Blue
• 30-node IBM supercomputer (& 480 single-chip chess search engines). 3-level architecture mixing software and hardware searching.
• 100-200 million evaluations of board configurations per sec (reached 330 million at one point).
• Alpha-beta search with a complex evaluation function
• In 3 minutes for each move can search full width 12 deep and some paths up to 40 deep.
• Optimised with knowledge of opening and end games.
Games with an element of chance
• expectiminimax algorithm
• chance nodes: value is the sum of the successor values, each weighted by its probability
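A minimal sketch of expectiminimax, assuming a tree encoded as tagged tuples: `("max", [...])`, `("min", [...])`, and `("chance", [(probability, child), ...])`, with plain numbers as leaves. The encoding and the example values are hypothetical.

```python
def expectiminimax(node):
    """Value of a game tree containing MAX, MIN and chance nodes."""
    if isinstance(node, (int, float)):  # leaf payoff
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        # Probability-weighted sum of the successor values.
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind}")


# MAX chooses between two moves, each followed by a fair coin flip, then MIN replies.
tree = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [5, 3]))]),
])
value = expectiminimax(tree)
print(value)  # 4.0: the first move has expected value 4.0, the second 1.5
```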
Samuel’s checker-playing program
• Arthur Samuel (IBM)
• Learnt own evaluation function
– tuned the weights of a weighted linear function (up to 16 terms)
– used comparison with full search
• Remembered evaluation function values
– extends the effective depth of the search