Temperature Discovery
Martin Müller, Markus Enzenberger and Jonathan Schaeffer
• Introduction: local and global search
• Local search algorithms
• Temperature
• Environments and coupon stacks
• Temperature discovery search
• First results
Local and Global Search
Local search
• Partition game into sum of subgames
• Local analysis
• Problem: how to evaluate local results?
• Central question: which sums of games are wins?
Global search
• Single, monolithic game state
• Full board evaluation
• Single game tree, minimax backup
• Central question: what is the minimax score?
Why Local Search?
Global Alpha-beta: Search time exponential in size of full problem
Local search: time exponential in size of subproblems
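The gap between the two growth rates can be illustrated with a crude node count. This is only a sketch: the uniform branching factor and the per-subgame depths are made-up parameters, not figures from the slides, and real searches with pruning will behave differently.

```python
def global_nodes(branching, depths):
    """Leaf count of one monolithic tree whose depth is the sum of all subgame depths."""
    return branching ** sum(depths)


def local_nodes(branching, depths):
    """Total leaf count when each subgame is searched separately."""
    return sum(branching ** d for d in depths)


# Hypothetical example: two subgames, each 4 plies deep, branching factor 3.
example_depths = [4, 4]
global_cost = global_nodes(3, example_depths)  # 3 ** 8
local_cost = local_nodes(3, example_depths)    # 3 ** 4 + 3 ** 4
```

Under these assumptions the global tree has 6561 leaves and the two local trees have 162 in total; adding a further subgame multiplies the first number and only adds to the second.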
[Figure: search diagram labelled "exact values in terminal positions", "position", "horizon" and "propagated values"]
[Chart: logarithmic axis from 1 to 10,000,000; problems A, A..B, A..C through A..M; series: none, sort, global, global+sort, local, local+sort, local+POprune, local+POprune+sort, LCGS]
...
Results of Local Searches
1. Exact: combinatorial game value (Winning Ways, my Ph.D. thesis on Go endgames)
2. Inexact, but “very good”: temperatures, thermographs (Go: Berlekamp, Spight, Fraser, Müller; Amazons: Theo Tegos)
3. Even less exact: heuristic search to estimate the temperature (this work, with Markus and Jonathan)
1. Decomposition Search
• Usual: global game tree search
• DS: divide-and-conquer approach
• Idea: divide game into sub-games
• Do a local search
• Combine local results: combinatorial game theory
[Figure: board divided into subgames labelled A to F]
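The "divide game into sub-games" step can be sketched as a flood fill over the board. The slides do not say how the partition is computed, so everything here is an assumption: the board is a list of strings, "X" marks a blocked square, and squares that touch horizontally, vertically or diagonally belong to the same subgame (queen-style movement, as in Amazons). The function name `partition_board` is hypothetical.

```python
def partition_board(board):
    """Split the non-blocked squares of a board into connected regions.

    board: list of equal-length strings; "X" marks a blocked square.
    Returns a list of regions, each a sorted list of (row, col) pairs.
    """
    height, width = len(board), len(board[0])
    seen = set()
    regions = []
    for row in range(height):
        for col in range(width):
            if board[row][col] == "X" or (row, col) in seen:
                continue
            # Depth-first flood fill from this unvisited square.
            region = []
            stack = [(row, col)]
            seen.add((row, col))
            while stack:
                r, c = stack.pop()
                region.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        inside = 0 <= nr < height and 0 <= nc < width
                        if inside and (nr, nc) not in seen and board[nr][nc] != "X":
                            seen.add((nr, nc))
                            stack.append((nr, nc))
            regions.append(sorted(region))
    return regions


# Hypothetical board: a full column of blocked squares separates two areas.
example_regions = partition_board(["..X..", "..X..", "..X.."])
```

In this example the blocked column yields two regions of six squares each, which could then be searched as independent subgames.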
2. Temperatures, Thermographs
[Figure: thermograph plotting t against score, with Left scaffold, Right scaffold, mean and temperature marked]
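For the simplest hot game, a switch {a | b} with numbers a ≥ b, the quantities named in the thermograph can be written down directly: the Left scaffold is a − t, the Right scaffold is b + t, and they meet at the temperature (a − b)/2, above which both walls sit at the mean (a + b)/2. The sketch below covers only this special case; the function names are hypothetical and general games need a full bottom-up computation.

```python
def mean_and_temperature(a, b):
    """Mean and temperature of the switch {a | b}, assuming a >= b."""
    if a < b:
        raise ValueError("this sketch only handles switches with a >= b")
    return (a + b) / 2, (a - b) / 2


def thermograph_walls(a, b, t):
    """Left and right wall of the thermograph of {a | b} at tax level t >= 0."""
    mean, temperature = mean_and_temperature(a, b)
    if t >= temperature:
        # Above the temperature the scaffolds have met: both walls are the mast.
        return mean, mean
    # Below it, the walls follow the Left scaffold a - t and Right scaffold b + t.
    return a - t, b + t
```

For example, the switch {4 | -2} has mean 1 and temperature 3; at t = 1 the walls are at 3 and -1, and at any t of 3 or more both walls are at 1.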
3. Temperature Discovery
• Problem: thermographs computed “bottom-up”
• Needs complete local game tree
• Sometimes too expensive
• Heuristic evaluation works well in global search
• Idea: use it in local search to estimate temperature
Temperature Discovery
A different way to compute temperatures (Berlekamp):
• Play local game + “coupon stack”
• Choose between play on the board and “coupon” (move of known value)
• Temperature of coupon of value t is t, so can estimate temperature of board!
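The coupon-stack idea can be tried on a toy case where the answer is known. In the sketch below the local game is a switch {left_score | right_score} that ends after one board move, Left moves first, a player must move whenever a move exists, and the search is an exhaustive minimax. These are simplifying assumptions for illustration: the search described in these slides runs on real Amazons positions with a depth limit and a heuristic evaluation, which is not reproduced here.

```python
from functools import lru_cache


def principal_variation(left_score, right_score, coupons):
    """Minimax over 'take top coupon' vs 'play the board' for a switch game.

    coupons: coupon values, top of the stack first.
    Returns the principal variation as a list of move labels.
    """
    n = len(coupons)

    @lru_cache(maxsize=None)
    def search(top, played, left_to_move):
        """Return (future score from Left's point of view, best move)."""
        sign = 1 if left_to_move else -1
        options = []
        if top < n:
            # Taking a coupon is worth its value to the player who takes it.
            value, _ = search(top + 1, played, not left_to_move)
            options.append((sign * coupons[top] + value, "coupon"))
        if not played:
            # The board move fixes the local score; it is already Left-relative.
            gain = left_score if left_to_move else right_score
            value, _ = search(top, True, not left_to_move)
            options.append((gain + value, "board"))
        if not options:
            return 0, None
        pick = max if left_to_move else min
        return pick(options, key=lambda option: option[0])

    moves, top, played, left_to_move = [], 0, False, True
    while True:
        _, move = search(top, played, left_to_move)
        if move is None:
            return moves
        if move == "board":
            moves.append("board")
            played = True
        else:
            moves.append(f"coupon {coupons[top]}")
            top += 1
        left_to_move = not left_to_move


# The switch {5 | 0} has temperature 2.5; coupons 4, 3, 2, 1, 0 are offered.
example_pv = principal_variation(5, 0, (4, 3, 2, 1, 0))
```

In this example the players take coupons 4 and 3 and then Left plays the board while coupon 2 is on top, so the switch from coupons to board brackets the true temperature of 2.5 between 2 and 3.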
Example
Coupon stack 3, 2, 1, 0, -1
Amazons board
Search depth 4
1. B: Coupon(3)
2. W: C8-C7xC8
3. B: Coupon(2)
4. W: Coupon(1)

9 . . X .
8 . . W .
7 X . . B
6 . X . .
  A B C D
Example (cont’d)
• Uses heuristic evaluation of board
• Depth-limited search
• Result: at which coupon value does play change from taking coupons to moving on the board?
• That value is the estimate for the temperature
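Reading the switch point off a principal variation can be sketched as follows, using the move sequence from the example slide. The move strings, the `Coupon(v)` parsing and the function name are assumptions made for illustration, and the slides do not state how the bracket is turned into a single number.

```python
def coupon_to_board_switch(stack, moves):
    """Find where a move sequence first leaves the coupon stack for the board.

    stack: coupon values, top first.  moves: strings such as "Coupon(3)" or a
    board move like "C8-C7xC8".
    Returns (coupon on top when the board move was chosen, last coupon taken);
    either entry is None if no such coupon exists.
    """
    top = 0
    last_taken = None
    for move in moves:
        if move.startswith("Coupon(") and move.endswith(")"):
            value = int(move[len("Coupon("):-1])
            if top >= len(stack) or value != stack[top]:
                raise ValueError(f"{move} is not the top coupon")
            last_taken = value
            top += 1
        else:
            declined = stack[top] if top < len(stack) else None
            return declined, last_taken
    return None, last_taken


# Stack and principal variation from the example slide.
example_switch = coupon_to_board_switch(
    [3, 2, 1, 0, -1],
    ["Coupon(3)", "C8-C7xC8", "Coupon(2)", "Coupon(1)"],
)
```

For the example, Black takes the coupon worth 3 and White then prefers the board move to the coupon worth 2, so the result is (2, 3): the estimated temperature lies between those two values.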
Experiments (1)
• Run temperature discovery search on small areas
• Compare estimated t against exact t from Theo Tegos’ databases
• Plot real t vs estimated t
• Works OK, but still some problems/bugs?
Experiments (2)
• Sample starting positions with 2, 4 and 6 subgames
• Subgame size 4x4, 5x5
• Temperature discovery in each local game
• Simple ‘hotstrat’ player
• Play 2x200 games against Arrow (full board search)
• ‘Coupon player’ vs Arrow
• About 10 sec./move
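Hotstrat is the strategy of moving in the subgame with the highest temperature. A minimal sketch of that selection step, assuming the per-subgame temperature estimates are already available as a list (the function name and the sample numbers are hypothetical):

```python
def hotstrat_choice(temperatures):
    """Return the index of the hottest subgame, or None if there are none.

    temperatures: one estimated temperature per subgame.
    Ties go to the earliest subgame.
    """
    if not temperatures:
        return None
    return max(range(len(temperatures)), key=lambda i: temperatures[i])


# Hypothetical estimates for four subgames.
example_choice = hotstrat_choice([1.5, 3.0, 2.0, 3.0])
```

Here the player would move in the second subgame (index 1), playing whatever local move the temperature discovery search found there.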
Two, four, six 4x4 subgames
[Chart: 4x4]
Two and Four 5x5 subgames
[Chart: 5x5]
13.25 average over 200 pairs of games (stdDev 11.5)
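A rough way to judge how far this average is from zero is the standard error of the mean. The calculation below assumes that the 200 pair results are independent and that 11.5 is the standard deviation of the individual pair results; neither is stated on the slide.

```python
import math


def standard_error(std_dev, n):
    """Standard error of a mean over n independent samples."""
    return std_dev / math.sqrt(n)


mean_score = 13.25
error = standard_error(11.5, 200)   # about 0.81
ratio = mean_score / error          # about 16 standard errors from zero
```

Under those assumptions the average lies roughly 16 standard errors above zero.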
5x5 subgames
[Chart: SCORE_SUM]
Control experiment
• Arrow (10 sec) vs Arrow on four 4x4
• Different time limits for opponent
[Chart: score, for opponent time limits of 1s, 5s, 10s and 30s]
Sample 4x5x5 Game
To Do...
• More experiments, e.g. 6x5x5, 6x6,...
• Try on real games
• Better sum game algorithm
• Tune, fix temperature discovery search
• Optimal solver? (Needs global search too)
• The real goal: apply to Go!
Summary
• Local search algorithm
• Try to discover temperature by minimax search
• Applications: Amazons, future: Go
• First results: it works...
• Still lots of open questions