University of Alberta
A New Paradigm for Minimax Search
by
Aske Plaat, Jonathan Schaeffer, Wim Pijls and Arie de Bruin
Technical Report TR 94–18, December 1994
DEPARTMENT OF COMPUTING SCIENCE
The University of Alberta
Edmonton, Alberta, Canada
A New Paradigm for Minimax Search
Aske Plaat, Erasmus University, [email protected]
Jonathan Schaeffer, University of Alberta, [email protected]
Wim Pijls, Erasmus University, [email protected]
Arie de Bruin, Erasmus University, [email protected]
Erasmus University, Department of Computer Science, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands
University of Alberta, Department of Computing Science, 615 General Services Building, Edmonton, Alberta, Canada T6G 2H1
Abstract
This paper introduces a new paradigm for minimax game-tree search algorithms. MT is a memory-enhanced version of Pearl's Test procedure. By changing the way MT is called, a number of best-first game-tree search algorithms can be simply and elegantly constructed (including SSS*).
Most of the assessments of minimax search algorithms have been based on simulations. However, these simulations generally do not address two of the key ingredients of high performance game-playing programs: iterative deepening and memory usage. This paper presents experimental data from three game-playing programs (checkers, Othello and chess), covering the range from low to high branching factor. The improved move ordering due to iterative deepening and memory usage results in significantly different results from those portrayed in the literature. Whereas some simulations show Alpha-Beta expanding almost 100% more leaf nodes than other algorithms [9], our results showed variations of less than 20%.
One new instance of our framework (MTD-f) out-performs our best alpha-beta searcher (aspiration NegaScout) on leaf nodes, total nodes and execution time. To our knowledge, these are the first reported results that compare both depth-first and best-first algorithms given the same amount of memory.
Keywords: Minimax-tree search algorithms, Alpha-Beta, SSS*.
Contents
1 Introduction
2 Memory-enhanced Test
  2.1 A Binary Decision Procedure
  2.2 Example
  2.3 Discussion
3 Drivers for MT
  3.1 Elegance
  3.2 Efficiency
  3.3 Memory Usage
  3.4 Starting Value
4 Experiments
  4.1 Previous Work
  4.2 Experiment Design
  4.3 Base Line
  4.4 Experiment Results
  4.5 Execution Time
  4.6 MTD-best
5 Conclusions
1 Introduction
For over 30 years, Alpha-Beta has been the algorithm of choice for searching game trees. Using a simple left-to-right depth-first traversal, it is able to efficiently search trees [8, 14]. Several important enhancements were added to the basic Alpha-Beta framework, including iterative deepening, transposition tables [24], the history heuristic [22] and minimal window searching [2, 13, 18]. Several studies showed that the algorithm's efficiency was approaching optimal and that there was little room for improvement [4, 5, 22].
In 1979, Stockman introduced the SSS* algorithm, which was able to provably expand fewer leaf nodes than Alpha-Beta by adopting a best-first search strategy [25]. Scientific skepticism quickly followed as it became evident that SSS* had serious implementation drawbacks. These problems revolved around the OPEN list, a data structure whose size was exponential in the depth of the search tree, and the expensive operations required to maintain the list in sorted order [20]. Nevertheless, the potential for building smaller search trees had been demonstrated. Unfortunately, it seemed that SSS* was an interesting idea for the theoreticians, but it failed to become an algorithm used by the practitioners.
This paper introduces the Memory-enhanced Test (MT) algorithm, a variation on Pearl's Test procedure [13]. This routine efficiently searches a game tree to answer a binary question. MT can be called from a simple driver routine which may make repeated calls to MT. By using memory to store previously seen nodes (a standard transposition table), MT can be used efficiently to re-search trees.
A search yielding a single binary decision is usually not useful for game-playing programs. Repeated calls to MT can be used to home in on the minimax value. By constructing different driver programs that call MT, different algorithms can be created. In particular, a variety of best-first algorithms can be implemented using depth-first search, such as SSS* and DUAL* [9, 19]. The surprising result is that a series of binary-valued searches is more effective at determining the minimax value than searches over a wide range of values. It is interesting to note that the basic Alpha-Beta algorithm cannot be created using MT, because its wide search window causes fewer cutoffs than MT-based algorithms.
The MT drivers (MTD) can easily be changed to create new algorithms. Some new algorithms are introduced in this paper and one of them, MTD-f, out-performs SSS*, DUAL*, Alpha-Beta and PVS/NegaScout [2, 18] on average. While SSS* uses +∞ as its initial search bound and DUAL* uses −∞, MTD-f uses the result of the previous iteration in an iterative deepening search. Starting with a bound closer to the expected outcome increases search efficiency, as we will show.
Most papers in the literature compare game-tree search algorithms using simulations. Typically, the simulations do not use iterative deepening or transposition tables (for example, [7, 9, 12]). Since these are necessary ingredients for achieving high performance in real applications, this is a serious omission. Rather than use simulations, our results have been taken from three game-playing programs. These include the checkers program Chinook (a game with low branching factor), the Othello program Keyano (medium branching factor) and the chess program Phoenix (high branching factor). All three programs are well-known in their respective domains. Experimentally comparing a variety of best-first and depth-first algorithms using the same storage requirements paints a very different picture of the relative strengths of the algorithms. In particular, the results of non-iterative deepening experiments reported in the literature are misleading. With iterative deepening and transposition tables, move ordering is improved to the point where all the algorithms start approaching the minimal search tree and the differences between the algorithms become less pronounced.
Since binary decision procedures are shown to be more effective than wide-windowed searches, there is no good reason for continued usage of wide-windowed Alpha-Beta. Although every introductory book on artificial intelligence discusses Alpha-Beta, we would argue that MT is a simpler algorithm, is easier to understand, and is more efficient than Alpha-Beta. To put it bluntly, why would anyone want to use Alpha-Beta?
2 Memory-enhanced Test
This section discusses the advantages of using a binary decision procedure for doing game-tree search.
2.1 A Binary Decision Procedure
Pearl introduced the concept of a proof procedure for game trees in his Scout algorithm [15]. A proof procedure takes a search assertion (for example, the minimax value of the tree is ≥ 10) and returns a binary value (assertion is true or false). It was invented to simplify the analysis of the efficiency of Alpha-Beta. In doing the analysis, Pearl discovered that an algorithm using a binary decision procedure could search trees more efficiently. The proof procedure was called Test, since it was used to test whether the minimax value of a subtree would lie above or below a specified threshold. Scout was the driver program that called Test. Enhancements to the algorithm were developed and analyzed (this is well documented in [19]). It turns out that Scout-based algorithms will never consider more unique leaf nodes than would Alpha-Beta. For the special case of a perfectly ordered tree both Alpha-Beta and the Scout-variants search the so-called minimal search tree [8]. Simulations have shown that, on average, Scout-variants (such as NegaScout) significantly out-perform Alpha-Beta [7, 9]. However, when used in practice with iterative deepening, aspiration searching and transposition tables, the quality of the move ordering greatly increases. As a result, the relative advantage of NegaScout significantly decreases [22].
Test does an optimal job of answering a binary-valued question. However, game-tree searches are usually over a range of values. We would like to use the efficiency of Test to find the minimax value of a search tree. Repeated calls to Test will be inefficient, unless Test is modified to reuse the results of previous searches. Enhancing Test with memory yields a new algorithm which we call MT (short for Memory-enhanced Test), as illustrated in figure 1. The storage can be organized as a familiar transposition table [10]. Before a node is expanded in a search, a check is made to see if the value of the node is available in memory, the result of a previous search. Later on we will see that adding storage has some other benefits that are crucial for the algorithm's efficiency.
We would like to formulate our version of Test as clearly and concisely as possible, using the often used Negamax formulation [8]. Here we encounter a problem. Suppose we want to test whether the minimax value of maximizing node n is at least ω. A call Test(n, ω) would do this. A child returning a value ≥ ω is sufficient to answer the question, and therefore causes a cutoff. Unfortunately, for a minimizing node, a cutoff should occur only when its return value g > ω (using a Negamax formulation). There is a simple way of removing the ">"/"≥" asymmetry: use input parameters of the form ω − ε and ω + ε, where ε is a value smaller than the difference between any two leaf node evaluations. Then MT(n, ω − ε) guarantees that the values of all nodes in the tree are either greater than or less than ω − ε; no ties are possible. So, there are two differences between MT and Test. One is that MT is called with ω − ε instead of ω, so that there is no need for the "=" part in the cutoff test, obviating extra tests in the code of MT or artificial incrementing/decrementing of the input parameter (as in [11]). The second (and more fundamental) difference is that MT uses storage to pass on search results from one pass to the next, making efficient multi-pass searches possible.
Figure 1 shows the pseudo-code for MT. The routine assumes an evaluate routine that assigns a value to a node. Determining when to call evaluate is application-dependent and is hidden in the definition of the condition n = leaf. For a depth d fixed-depth search, a leaf is any node that is d moves from the root of the tree. The search returns an upper or lower bound on the search value at each node, denoted by f+ and f− respectively. Before searching a node, the transposition table information is retrieved and, if it has been previously searched deep enough, the search is cut off. At the completion of a node, the bound on the value is stored in the transposition table. The bounds stored with each node are denoted using Pascal's dot-notation.
function MT(n, γ) → g;
  { precondition: γ ≠ any leaf-evaluation; MT must be called with γ = ω − ε to prove g < ω or g ≥ ω }
  if retrieve(n) = found then
    if n.f− > γ then return n.f−;
    if n.f+ < γ then return n.f+;
  if n = leaf then
    n.f+ := n.f− := g := evaluate(n);
  else
    g := −∞;
    c := firstchild(n);
    while g < γ and c ≠ ⊥ do
      g := max(g, −MT(c, −γ));
      c := nextbrother(c);
    if g < γ then n.f+ := g else n.f− := g;
  store(n);
  return g;
Figure 1: MT, a memory-enhanced version of Pearl's Test.
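A minimal Python sketch of figure 1 may help make the control flow concrete. It rests on several assumptions that are not in the report: the tree is a nested list with integer leaves, the transposition table is a plain dict keyed by the path from the root (so neither collisions nor transpositions occur), the root is a maximizing node, and leaf values are given from the root player's point of view and negated at odd depth to fit the Negamax convention. The names mt, table and path are illustrative only.

```python
import math

def mt(node, gamma, table, path=()):
    """Negamax MT sketch: the result g is an upper bound if g < gamma, a lower bound if g > gamma.

    gamma must differ from every leaf value, e.g. gamma = omega - 0.5 for integer leaves.
    """
    lo, hi = table.get(path, (-math.inf, math.inf))  # stored f- and f+ of this node
    if lo > gamma:
        return lo
    if hi < gamma:
        return hi
    if not isinstance(node, list):
        # Leaf: value seen by the side to move (assumption: the root is a max node).
        g = node if len(path) % 2 == 0 else -node
        lo = hi = g
    else:
        g = -math.inf
        for i, child in enumerate(node):
            g = max(g, -mt(child, -gamma, table, path + (i,)))
            if g >= gamma:  # the loop of figure 1 runs only while g < gamma
                break
        if g < gamma:
            hi = g  # fail low: g is an upper bound f+
        else:
            lo = g  # fail high: g is a lower bound f-
    table[path] = (lo, hi)
    return g

# Root is a max node over two min nodes: max(min(3, 5), min(2, 9)) = 3.
tree = [[3, 5], [2, 9]]
print(mt(tree, 2.5, {}))  # 3, above 2.5: the value is at least 3
print(mt(tree, 3.5, {}))  # 3, below 3.5: the value is at most 3
```

Each call answers only one binary question; combining the two answers above pins the value to exactly 3, which is the job of the driver routines.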
Usually we want to know more than just a bound on the minimax value. Using repeated calls to MT, the search can home in on the minimax value until it is found. To achieve this, MT must be called from a driver routine. One idea for such a driver would be to start at an upper bound for the search value, f+ = +∞. Subsequent calls to MT can lower this bound until the minimax value is reached. Figure 2 shows the pseudo-code for this driver, called MTD+∞. The variable g is at all times an upper bound f+ on the minimax value of the root of the game tree [17]. Surprisingly, MTD+∞ expands the same leaf nodes in the same order as SSS*, provided that no collisions occur in the transposition table, and that its size is of order O(w^⌈d/2⌉) [3, 17].
function MTD+∞(n) → f;
  g := +∞;
  repeat
    bound := g;
    g := MT(n, bound − ε);
  until g = bound;
  return g;
Figure 2: MTD+∞: A sequence of MT searches to find f.
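The driver of figure 2 can be sketched in Python along the same lines. This is an illustration under assumptions, not the authors' code: integer leaf values (so ε = 0.5 is safe), a nested-list tree with a maximizing root, and a dict keyed by path as the table. The small tree and the names mt and mtd_plus_inf are hypothetical.

```python
import math

EPS = 0.5  # assumption: integer leaf values, so bound - EPS never ties with a leaf

def mt(node, gamma, table, path=()):
    """Negamax MT with a dict as transposition table (sketch of figure 1)."""
    lo, hi = table.get(path, (-math.inf, math.inf))
    if lo > gamma:
        return lo
    if hi < gamma:
        return hi
    if not isinstance(node, list):
        g = node if len(path) % 2 == 0 else -node  # leaf, side-to-move viewpoint
        lo = hi = g
    else:
        g = -math.inf
        for i, child in enumerate(node):
            g = max(g, -mt(child, -gamma, table, path + (i,)))
            if g >= gamma:
                break
        if g < gamma:
            hi = g
        else:
            lo = g
    table[path] = (lo, hi)
    return g

def mtd_plus_inf(root):
    """Figure 2 as a loop: keep lowering the upper bound g until a call fails to lower it."""
    table = {}
    g = math.inf
    bounds = []  # value returned by each pass, kept only for inspection
    while True:
        bound = g
        g = mt(root, bound - EPS, table)
        bounds.append(g)
        if g == bound:
            return g, bounds

# max(min(8, 5), min(4, 9)) = 5
value, bounds = mtd_plus_inf([[8, 5], [4, 9]])
print(value, bounds)  # 5 [8, 5, 5]
```

The first pass yields the upper bound 8, the second lowers it to 5, and the third fails high at 5, which ends the loop.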
2.2 Example
The following example illustrates how MTD+∞ traverses a tree as it computes a sequence of upper bounds on the game value. For ease of comparison, the example tree in figure 3 is the same as has been used in other papers [15]. Figures 4–7 show the four stages involved in building the tree. In these figures, the g values returned by MT are given beside each interior node. Maximizing nodes are denoted by a square; minimizing nodes by a circle. The discussion uses the code in figures 1 and 2.
Figure 3: Example tree for MTD+∞. (Diagram not reproduced. Interior nodes: a, b, c, d, f, h, i, j, l, p, q. Leaf values in the order shown: e = 41, n = 5, g = 12, 90, 101, 80, 20, 30, k = 34, 80, m = 36, o = 35, r = 50, s = 36, 25, 3.)
First Pass: (figure 4)
MTD+∞ initially starts with a value of +∞ (1000 for our purposes). MT expands all branches at max nodes and a single branch at a min node. In other words, the maximum number of cutoffs occurs since all values in the tree are < 1000 − ε (< ∞). Figure 4 shows the resulting tree that is searched and stored in memory. The minimax value of this tree is 41. Note that at max nodes a, c and i an f+ value is saved in the transposition table, while an f− value is saved at min nodes b, d, f, h, j and l. Both bounds are stored at leaf nodes e, g, k and m since the minimax value for that node is exactly known. These stored nodes will be used in the subsequent passes.
Figure 4: Pass 1. (Diagram not reproduced. Bounds shown: f+(a) = 41, f−(b) = 41, f+(c) = 41, f−(d) = 41, f−(f) = 12, f−(h) = 36, f+(i) = 36, f−(j) = 34, f−(l) = 36; leaves e = 41, g = 12, k = 34, m = 36.)
Second Pass: (see figure 5)
MT is called again since the previous call to MT improved the upper bound on the minimax value from 1000 to 41 (the variable g in figure 2). MT now attempts to find whether the tree value is less than or greater than 41 − ε. The left-most path from the root to a leaf, where each node along that path has the same g value as the root, is called the critical path or principal variation. The path a, b, c, d down to e is the critical path that is descended in the second pass. e does not have to be reevaluated; its value comes from the transposition table. Since d is a minimizing node, the first child e does not cause a cutoff (a cutoff would require a value < 41 − ε) and child n must be expanded. n's value gets backed up to c, which then has to investigate child f. The bound on f, computed in the previous pass, causes the search to stop exploring this branch immediately. c takes on the maximum of 12 and 5, and this becomes the new value for b. Since h has a value < 41 − ε (from the previous pass), the search is complete; both of a's children prove that a has a value less than 41. The resulting tree defines a new upper bound of 36 on the minimax value.
Figure 5: Pass 2. (Diagram not reproduced. Bounds shown: f+(a) = 36, f−(b) = 12, f+(c) = 12, f−(d) = 5, f−(f) = 12 (skipped), f−(h) = 36 (skipped); leaves e = 41, n = 5.)
Third Pass: (see figure 6)
The search now attempts to lower the minimax value below 36. From the previous search, we know b has a value < 36 but h does not. The algorithm follows the critical path to the node giving the 36 (h to i to l to m). m's value is known, and it is not less than 36. Thus node o must be examined, yielding a value of 35. From i's point of view, the new value for l (35) and the bound on j's value (34 from the first pass) are both less than 36. i gets the maximum (35) and this gets propagated to the root of the tree. The bound on the minimax value at the root has been improved from 36 to 35.
Figure 6: Pass 3. (Diagram not reproduced. Bounds shown: f+(a) = 35, f−(b) = 12 (skipped), f−(h) = 35, f+(i) = 35, f−(j) = 34 (skipped), f−(l) = 35; leaves m = 36, o = 35.)
Fourth Pass: (see figure 7)
The previous search lowered the bound from 36 to 35, meaning convergence has not occurred and the search must continue. The search follows the critical path a, h, i, l and o. At node l, both its children immediately return without having been evaluated; their value is retrieved from the transposition table. Note that the previous pass stored an f− value for l, while this pass will store an f+. There is room for optimization here by recognizing that all of l's children have been evaluated and thus we know the exact value for l (see [16, 18]). The value of l does not change and j's bound precludes it from being searched, thus i's value remains unchanged. i cannot lower h's value (no cutoff occurs), so the search explores p. p considers q which, in turn, must search r and s. Since p is a maximizing node, the value of q (36) causes a cutoff. Both of h's children are greater than 35 − ε. Node a was searched attempting to show whether its value was less than or greater than 35 − ε. h provides the answer: greater than. This call to MT fails high, meaning we have a lower bound of 35 on the search. The previous call to MT established an upper bound of 35. Thus the minimax value of the tree is proven to be 35.
Figure 7: Pass 4. (Diagram not reproduced. Bounds shown: f−(a) = 35, f−(b) = 12 (skipped), f+(h) = 35, f−(i) = 35, f−(j) = 34 (skipped), f+(l) = 35, f−(p) = 36, f+(q) = 36; leaves m = 36, o = 35, r = 50, s = 36.)
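The four passes can be replayed with a short Python sketch. It assumes that figure 3 is a uniform binary tree of depth 4 whose leaf values, read left to right, are the sixteen numbers listed with the figure; this reading of the diagram is an assumption, as are the nested-list encoding, the dict-based table and the value ε = 0.5. The script records the value returned by each pass and how many leaves were newly evaluated in it.

```python
import math

EPS = 0.5       # assumption: integer leaf values
evaluated = []  # paths of the leaves that were actually evaluated, in order

def mt(node, gamma, table, path=()):
    """Negamax MT with a dict as transposition table (sketch of figure 1)."""
    lo, hi = table.get(path, (-math.inf, math.inf))
    if lo > gamma:
        return lo
    if hi < gamma:
        return hi
    if not isinstance(node, list):
        evaluated.append(path)  # a table hit above never reaches this line
        g = node if len(path) % 2 == 0 else -node
        lo = hi = g
    else:
        g = -math.inf
        for i, child in enumerate(node):
            g = max(g, -mt(child, -gamma, table, path + (i,)))
            if g >= gamma:
                break
        if g < gamma:
            hi = g
        else:
            lo = g
    table[path] = (lo, hi)
    return g

# Assumed shape of figure 3: a = [b, h], b = [c, .], c = [d, f], h = [i, p], ...
TREE = [[[[41, 5], [12, 90]], [[101, 80], [20, 30]]],
        [[[34, 80], [36, 35]], [[50, 36], [25, 3]]]]

table, g, passes = {}, math.inf, []
while True:  # the MTD+infinity loop of figure 2
    bound, seen = g, len(evaluated)
    g = mt(TREE, bound - EPS, table)
    passes.append((g, len(evaluated) - seen))
    if g == bound:
        break
print(passes)  # [(41, 4), (36, 1), (35, 1), (35, 2)]
```

Under these assumptions the bounds 41, 36, 35, 35 match the text, and the new leaves per pass are e, g, k, m in the first pass, then n, then o, then r and s: eight leaf evaluations in total out of sixteen.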
2.3 Discussion
MTD+∞ causes MT to expand the same leaf nodes in the same order as SSS* (a proof can be found in [17]). How does MT, a depth-first search procedure, examine nodes in a best-first manner? The value of f+ − ε causes MT to explore only nodes that can lower the upper bound at the root. This is the best-first expansion order of SSS*: in the third pass of the example, lowering the value of node b cannot influence f+(a), but node h can. Selecting node h gives a best-first expansion order.
The example illustrates that storage is critical to the performance of a multi-pass MT algorithm. Without it, the program would revisit interior nodes without the benefit of information gained from the previous search. Instead, MT can retrieve valuable search information for a node, such as an upper or lower bound, using a relatively cheap table lookup. The storage table provides two benefits: (1) preventing unnecessary node re-expansion, and (2) guiding MT along the critical path, ensuring a best-first selection scheme. Both are necessary for the efficiency of the algorithm.
One could ask the question whether a simple one-pass Alpha-Beta search would not be as efficient. To see why MT is more efficient it helps to think of MT as a null-window Alpha-Beta search (β = α + 1) with a transposition table. Null-window searching has been shown equivalent to Test and has been discussed by numerous authors [2, 9, 15, 19]. Various papers point out that a tighter Alpha-Beta window causes more cutoffs than a wider window, all other things being equal (for example, [2, 17]). Since MT does not re-expand nodes from a previous pass, it cannot have fewer cutoffs than wide-windowed Alpha-Beta for new leaf nodes. This implies that any sequence of MT calls will be more efficient (it will never expand more leaf nodes and usually significantly fewer nodes) than a call to Alpha-Beta with window (−∞, +∞).
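The correspondence between MT and a null-window Alpha-Beta call can be checked on a small case. The sketch below assumes integer leaf values, a fail-soft Negamax Alpha-Beta, and the figure 3 tree read as a uniform binary tree of depth 4; the function names are hypothetical. With the storage stripped from figure 1, a call with γ = b − 0.5 should return the same result as Alpha-Beta with window (b − 1, b).

```python
import math

def alphabeta(node, alpha, beta, depth=0):
    """Fail-soft Negamax Alpha-Beta without a transposition table."""
    if not isinstance(node, list):
        return node if depth % 2 == 0 else -node
    g = -math.inf
    for child in node:
        g = max(g, -alphabeta(child, -beta, -max(alpha, g), depth + 1))
        if g >= beta:
            break
    return g

def pearl_test(node, gamma, depth=0):
    """Figure 1 with the storage removed, i.e. a Negamax form of Test."""
    if not isinstance(node, list):
        return node if depth % 2 == 0 else -node
    g = -math.inf
    for child in node:
        g = max(g, -pearl_test(child, -gamma, depth + 1))
        if g >= gamma:
            break
    return g

# Figure 3 tree, assumed to be a uniform binary tree of depth 4.
TREE = [[[[41, 5], [12, 90]], [[101, 80], [20, 30]]],
        [[[34, 80], [36, 35]], [[50, 36], [25, 3]]]]

# Same answers for the null window (b - 1, b) and for gamma = b - 0.5.
same = [alphabeta(TREE, b - 1, b) == pearl_test(TREE, b - 0.5) for b in (41, 36, 35)]
print(same)
```

The check only covers a single storage-free call; the efficiency claim in the text additionally depends on the transposition table carrying results from one pass to the next.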
3 Drivers for MT
Having seen one driver for MT, the ideas can be encompassed in a generalized driver routine. The driver can be regarded as providing a series of calls to MT to successively refine bounds on the minimax value.
The driver code can be parameterized so that one piece of code can construct a variety of algorithms. The two parameters needed are:
first  The first starting bound for MT.
next  A search has been completed. Use its result to determine the next bound for MT.
function MTD(first, next, n) → f;
  f+ := +∞; f− := −∞;
  bound := g := first;
  repeat
    { The next operator must set the variable bound }
    next;
    g := MT(n, bound − ε);
    if g < bound then f+ := g else f− := g;
  until f− = f+;
  return g;
Figure 8: A framework for MT drivers.
Using these parameters, an algorithm using our MT driver, MTD, can be expressed as MTD(first, next). The corresponding pseudo-code can be found in figure 8. A number of interesting algorithms can easily be constructed using MTD. In the following, +∞ and −∞ are used as the upper and lower bounds on the range of leaf values. In actual implementations, these bounds are suitably large finite numbers.
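The framework of figure 8 translates into a short Python loop. The sketch assumes integer leaf values (ε = 0.5), a nested-list tree with a maximizing root and a dict as table; passing next as a function of (g, bound, f+, f−) is a design choice of this sketch, not something prescribed by the report. The figure 3 tree, read as a uniform binary tree of depth 4, serves as test data, and the driver shown is the one from section 2, which starts at +∞ and always continues from the last result.

```python
import math

EPS = 0.5  # assumption: integer leaf values

def mt(node, gamma, table, path=()):
    """Negamax MT with a dict as transposition table (sketch of figure 1)."""
    lo, hi = table.get(path, (-math.inf, math.inf))
    if lo > gamma:
        return lo
    if hi < gamma:
        return hi
    if not isinstance(node, list):
        g = node if len(path) % 2 == 0 else -node
        lo = hi = g
    else:
        g = -math.inf
        for i, child in enumerate(node):
            g = max(g, -mt(child, -gamma, table, path + (i,)))
            if g >= gamma:
                break
        if g < gamma:
            hi = g
        else:
            lo = g
    table[path] = (lo, hi)
    return g

def mtd(root, first, next_bound):
    """Figure 8 as a loop; next_bound plays the role of the 'next' operator."""
    table = {}
    f_plus, f_minus = math.inf, -math.inf
    bound = g = first
    while True:
        bound = next_bound(g, bound, f_plus, f_minus)
        g = mt(root, bound - EPS, table)
        if g < bound:
            f_plus = g   # the call failed low: new upper bound
        else:
            f_minus = g  # the call failed high: new lower bound
        if f_minus == f_plus:
            return g

TREE = [[[[41, 5], [12, 90]], [[101, 80], [20, 30]]],
        [[[34, 80], [36, 35]], [[50, 36], [25, 3]]]]

def from_above(g, bound, f_plus, f_minus):
    return g  # bound := g, the MTD+infinity driver

print(mtd(TREE, math.inf, from_above))  # 35
```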
• SSS* [MTD+∞]
MTD+∞ is just SSS* and can be described as:
MTD(+∞, bound := g).
• DUAL* [MTD−∞]
The literature describes a dual version of SSS*, where minimization is replaced by maximization, the OPEN list is kept in reversed order, and the start value is −∞ [9, 19]. This algorithm becomes:
MTD(−∞, bound := g + 1).
The advantage of DUAL* over SSS* lies in the search of odd-depth search trees. The trees that define an upper or lower bound are called solution trees [16, 25]. An upper bound is defined by a max solution tree, a lower bound by a min solution tree. In a max solution tree all children of max nodes are incorporated, and exactly one child of a min node. Dually for a min solution tree.
Whereas SSS* builds and refines a max solution tree of size O(w^⌈d/2⌉) on uniform trees, DUAL* builds and refines min solution trees of size O(w^⌊d/2⌋) [19]. Since the min solution trees are smaller, fewer leaf nodes are expanded and less storage is required. If, for example, w = 40 and d = 9 then it makes a big difference whether one has to search and store a tree with 40^4 or 40^5 leaves.
(Alternatively, for reasons of symmetry, we could replace the seventh line of figure 8 by g := MT(n, bound + ε). In this way DUAL* can be expressed as MTD(−∞, bound := g), which is more like the driver for SSS*.)
• MTD-bi
Since MT can be used to search from above (SSS*) as well as from below (DUAL*), an obvious try is to bisect the interval and start in the middle. Since each pass produces an upper or lower bound, we can take some pivot value in between as the next center for our search. In MTD terms, bisecting the range of interest becomes:
MTD(avg(+∞, −∞), avg(f+, f−))
where avg denotes the average of two values. To reduce big swings in the pivot value, some kind of aspiration searching will be beneficial in many application domains [22].
Weill introduced a similar algorithm which he named NegaC* [26]. He does not state the link with best-first SSS*-like behavior, nor the fact that this algorithm, provided there is enough storage, will never expand more leaf nodes than Alpha-Beta.
• MTD-f
Rather than arbitrarily using the mid-point as an approximation, any information on where the value is likely to lie can be used as a better approximation. Given that iterative deepening is used in many application domains, the obvious approximation for the minimax value is the result of the previous iteration. In MTD terms this algorithm becomes:
MTD(approximation, if g < bound then bound := g else bound := g + 1).
MTD-f can be viewed as starting close to f, and then doing either SSS* or DUAL*, skipping a large part of their search path.
• MTD-step
Instead of making tiny jumps from one bound to the next, as in all the above algorithms except MTD-bi, we could make bigger jumps:
MTD(+∞, bound := max(f−(root) + 1, g − stepsize))
(or the dual version) where stepsize is some suitably large value.
• MTD-best
If we are not interested in the game value itself, but only in the best move, then a stop criterion suggested by Berliner can be used [1]. Whenever the lower bound of one move is not lower than the upper bounds of all other moves, it is certain that this must be the best move. To prove this, we have to do less work than when we solve for f, since no upper bound on the value of the best move has to be computed. We can use either a prove-best strategy (establish a lower bound on one move and then try to create an upper bound on the others) or disprove-rest (establish an upper bound on all moves thought to be inferior and try to find a lower bound on the remaining move). The stop criterion in figure 8 must be changed to f−(bestmove) ≥ f+(othermoves). Note that this strategy has the potential to build search trees smaller than the minimal search tree.
The knowledge of which move should be regarded as best, and a suitable value for a first guess, might be obtained from a previous iteration in an iterative deepening scheme. The notion of which move is best might change during the search. This makes for a slightly more complicated implementation.
Figure 9: MT-based algorithms. (Diagram not reproduced. For DUAL*, SSS*, MTD-step, MTD-bi and MTD-f it shows how the successive bounds f+ and f− converge on f, starting from −∞, from +∞, or from a first guess h.)
Note that while all the above algorithms use a transposition table, not all of them need to save both f+ and f− values. For example, since SSS* always tries to lower the bound at the root, the transposition table needs to store only one bound.
Figure 9 illustrates the different strategies used by the above algorithms for converging on the minimax value.
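Three of the drivers listed above (DUAL*, MTD-f and MTD-step) can be written as parameter pairs for one generic loop. The sketch below is illustrative and rests on assumptions: integer leaf values with ε = 0.5, the figure 3 tree read as a uniform binary tree of depth 4, a first guess of 30 standing in for the result of a previous iteration, and a step size of 10. The function names and the argument list of the next functions are choices of this sketch.

```python
import math

EPS, STEP = 0.5, 10  # assumptions: integer leaf values; hypothetical step size

def mt(node, gamma, table, path=()):
    """Negamax MT with a dict as transposition table (sketch of figure 1)."""
    lo, hi = table.get(path, (-math.inf, math.inf))
    if lo > gamma:
        return lo
    if hi < gamma:
        return hi
    if not isinstance(node, list):
        g = node if len(path) % 2 == 0 else -node
        lo = hi = g
    else:
        g = -math.inf
        for i, child in enumerate(node):
            g = max(g, -mt(child, -gamma, table, path + (i,)))
            if g >= gamma:
                break
        if g < gamma:
            hi = g
        else:
            lo = g
    table[path] = (lo, hi)
    return g

def mtd(root, first, next_bound):
    """Generic driver in the spirit of figure 8."""
    table = {}
    f_plus, f_minus = math.inf, -math.inf
    bound = g = first
    while f_minus != f_plus:
        bound = next_bound(g, bound, f_plus, f_minus)
        g = mt(root, bound - EPS, table)
        if g < bound:
            f_plus = g
        else:
            f_minus = g
    return g

# (first, next) pairs following the MTD(...) expressions in the list above.
DRIVERS = {
    "DUAL*": (-math.inf, lambda g, bound, f_plus, f_minus: g + 1),
    "MTD-f": (30, lambda g, bound, f_plus, f_minus: g if g < bound else g + 1),
    "MTD-step": (math.inf, lambda g, bound, f_plus, f_minus: max(f_minus + 1, g - STEP)),
}

TREE = [[[[41, 5], [12, 90]], [[101, 80], [20, 30]]],
        [[[34, 80], [36, 35]], [[50, 36], [25, 3]]]]

results = {name: mtd(TREE, first, nxt) for name, (first, nxt) in DRIVERS.items()}
print(results)
```

MTD-bi is left out because averaging +∞ and −∞ has no meaning in floating point; as the text notes, an implementation would use suitably large finite bounds instead.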
The MTD framework has a number of important advantages for reasoning about game-tree search algorithms.
3.1 Elegance
Formulating a seemingly diverse collection of algorithms into one unifying framework focuses attention on the fundamental differences. For example, the framework allows the reader to see just how similar SSS* and DUAL* really are, and that these are just special cases of calling Test. The drivers concisely capture the algorithm differences. MTD offers us a high-level paradigm that facilitates the reasoning about important issues like algorithm efficiency and memory usage, without the need for low-level details.
3.2 Efficiency
All the algorithms presented are based on MT. Since MT is equivalent to a null-window Alpha-Beta call (plus storage), they search fewer nodes than the inferior one-pass Alpha-Beta(−∞, +∞) algorithm (see also [17]).
3.3 Memory Usage
An important issue concerning the efficiency of MT-based algorithms is memory usage. SSS* can be regarded as manipulating one max solution tree in place [17]. A max solution tree has only one successor at each min node and all successors at max nodes, while the converse is true for min solution trees. Whenever the upper bound is lowered, a new (better) subtree has been expanded. The subtree rooted in the left brother of the new subtree can be deleted from memory, since it is inferior [9, 21]. Since the new brother is better, the previous brother nodes will never contain the critical path again and will never be explored (other than as transpositions). For a max solution tree, this implies that there is always only one live child at a min node; better moves just replace it. Given a branching factor of w and a tree of depth d, the space complexity of a driver causing MT to construct and refine one max solution tree is therefore of the order O(w^⌈d/2⌉), and a driver manipulating one min solution tree is of order O(w^⌊d/2⌋). Some of the algorithms, such as MTD-bi, use two bounds. These require a max and a min solution tree, implying that the storage requirements are O(w^⌈d/2⌉). A simple calculation and empirical evidence show this to be realistic storage requirements.
A transposition table provides a flexible way of storing solution trees. While at any time entries from old (inferior) solution trees may be resident, they will be overwritten by newer entries when their space is needed. Garbage will be collected incrementally. As long as the table is big enough to store the min or max solution trees that are essential for the efficient operation of the algorithm, it provides for fast access and efficient storage (not to mention the move ordering and transposition benefits).
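The storage orders quoted above are easy to evaluate for the uniform-tree example used earlier (w = 40, d = 9). The helper names below are hypothetical, and the figures are leaf counts of a single solution tree on a uniform tree, not measurements from the programs.

```python
import math

def max_solution_tree_leaves(w, d):
    """Leaves of a max solution tree on a uniform tree: w ** ceil(d / 2)."""
    return w ** math.ceil(d / 2)

def min_solution_tree_leaves(w, d):
    """Leaves of a min solution tree on a uniform tree: w ** floor(d / 2)."""
    return w ** (d // 2)

w, d = 40, 9
print(max_solution_tree_leaves(w, d))  # 102400000, i.e. 40 ** 5
print(min_solution_tree_leaves(w, d))  # 2560000, i.e. 40 ** 4
```

For odd depths the two sizes differ by a factor of w; for even depths they coincide.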
3.4 Starting Value
Perhaps the biggest difference in the MTD algorithms is their first approximation of the minimax value: SSS* is optimistic, DUAL* is pessimistic and MTD-f is realistic. On average, there is a relationship between the starting bound and the size of the search trees generated. We have conducted tests that show that a sequence of MT searches to find the game value benefits from a start value close to the game value. Values like +∞ or −∞ as in SSS* and in DUAL* are in a sense the worst possible choices.
By doing minimal window searches, we hope to establish the left-most min solution tree and the left-most max solution tree as efficiently as possible. Doing searches with a different window can cause nodes to be expanded that are irrelevant for this proof. We have been unable to formulate this analytically, and instead present empirical evidence to support this conjecture.
Figure 10 validates the choice of a starting bound close to the game value. The figure shows the percentage of unique leaf evaluations of iterative deepening MTD-f. The data points are given as a percentage of the size of the search tree built by our best Alpha-Beta searcher (Aspiration NegaScout). (Since iterative deepening algorithms are used, the cumulative leaf count over all previous depths is shown for the depths.) Given an initial guess of h and the minimax value of f, the optimal search value for MTD-f, the graph plots the search effort expended for different values of h − f. To the left of the graph, MTD-f is closer to DUAL*, to the right it is closer to SSS*. To be better than Aspiration NegaScout, an algorithm must be less than the 100% baseline. A first guess close to f makes MTD-f perform better than the 100% Aspiration NegaScout baseline. The guess must be close to f for the effect to become significant. Thus, if MTD-f is to be effective, the f obtained from the previous iteration must be a good indicator of the next iteration's value.
Figure 10: Tree size relative to the first guess f. (Plots not reproduced. Three panels: Checkers, average over 20 trees, depths 13 and 15; Othello, average over 20 trees, depths 8 and 9; Chess, average over 20 trees, depths 6 and 7. Horizontal axis: difference of first guess from f. Vertical axis: cumulative leaves relative to Asp NS (%).)
4 Experiments
There are three ways to evaluate a new algorithm: analysis, simulation or empirical testing. The emphasis in the literature has been on analysis and simulation. This is surprising given the large number of game-playing programs in existence.
4.1 Previous Work
Mathematical analyses of minimax search algorithms do a good job of increasing our understanding of the algorithms, but fail to give reliable predictions of performance (for example, [6, 15, 16]). The problem is that the game trees that are analyzed differ from the trees generated by real game-playing programs.
To overcome this deficiency, a number of authors have conducted simulations (for example, [7, 9, 12, 19]). In our opinion, the simulations did not capture the behavior of realistic search algorithms as they are used in game-playing programs. Instead, we decided to conduct experiments in a setting that was to be as realistic as possible. These experiments attempt to address the concerns we have with the simulation parameters:
• Variable branching factor: simulations use a fixed branching factor.
• High degree of ordering: most simulations have the quality of their move ordering below what is seen in real game-playing programs.
• Iterative deepening: simulations use fixed-depth searching. Game-playing programs use iterative deepening to seed memory (transposition table) with best moves to improve the move ordering. This adds overhead to the search, which is more than offset by the improved move ordering.
Iterative deepening imposes a dynamic ordering on the tree. This means that different algorithms will search a tree differently. Thus, proofs that one fixed-depth algorithm searches fewer leaves than another fixed-depth algorithm (such as [25]) do not hold for iteratively-deepened searches.
• Memory: simulations assume either no storage of previously computed results, or unfairly bias their experiments by not giving all the algorithms the same storage. For iterative deepening to be effective, best move information from previous iterations must be saved in memory. In game-playing programs a transposition table is used. For best performance, this table may have to be of considerable size, since the move ordering information of the nodes of the previous search tree must be saved (which is exponential in the depth of the search).
• Tree size: simulations often use an inconsistent standard for counting leaf nodes. In conventional simulations (for example, [9]) each visit to a leaf node is counted for depth-first algorithms like NegaScout, whereas the leaf is counted only once for best-first algorithms like SSS* (because it was stored in memory, no re-expansion occurs). This problem is a result of the memory problem described above.
• Execution time: simulations are concerned with tree size, but practitioners are concerned with execution time. Simulation results do not necessarily correlate well with execution time. For example, there are many papers showing SSS* expands fewer leaf nodes than Alpha-Beta. However, SSS* implementations using Stockman's original algorithm have too much execution overhead to be competitive with Alpha-Beta.
• Value dependence: some simulations generate the value of a child independent of the value of the parent. However, there is usually a high correlation between the values of these two nodes in real games.
4.2 Experiment Design
To assess the feasibility of the proposed algorithms, a series of experiments was performed to compare Alpha-Beta, NegaScout, SSS* (MTD+∞), DUAL* (MTD−∞), MTD-f, and MTD-best.
Rather than using simulations, we gathered our data from three game-playing programs: Chinook (checkers) [23], Keyano (Othello) and Phoenix (chess) [21]. All three programs are well-known in their respective domains. For our experiments, we used the original author's search algorithm which, presumably, has been highly tuned to the application. The only change we made was to disable search extensions and, in the case of Chinook and Phoenix, forward pruning. All programs used iterative deepening. The MTD algorithms were called repeatedly with successively deeper search depths. All three programs used a standard transposition table with a maximum of 2^20 entries. Testing showed that the solution trees could comfortably fit in tables of this size. For our experiments we used the original program author's transposition table data structures and code without modification (footnote 1). At an interior node, the move suggested by the transposition table is always searched first (if known), and the remaining moves are ordered before being searched. Chinook and Phoenix use dynamic ordering based on the history heuristic [22], while Keyano uses static move ordering.
All three programs use transposition tables with only one bound (in contrast with our code in figure 1). Since MTD-bi and MTD-step manipulate both a max and a min solution tree, and therefore need to store both an upper and a lower bound at a node, we do not present these two algorithms. One of the points we are stressing in this paper is ease of implementation of the MT framework. The amount of work involved in altering the information that is stored in the transposition tables would compromise this objective.
The MT code given in figure 1 has been expanded to include two details, both of which are common practice in game-playing programs. The first is a search depth parameter. This parameter is initialized to the depth of the search tree. As MT descends the search tree, the depth is decremented. Leaf nodes are at depth zero. The second is the saving of the best move at each node. When a node is revisited, the best move from the previous search is always considered first.
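The two additions described above (a depth parameter and a saved best move) can be illustrated with a minimal Python sketch. This is not the code of figure 1, nor the code of any of the three programs. The toy tree (nested lists with integer leaves, whose height equals the initial depth), the path-keyed dictionary standing in for a transposition table, and the names `mt`, `table` and `path` are all assumptions made for illustration. Unlike the programs tested, the sketch stores both bounds, as the text says figure 1 does.

```python
import math

def mt(node, gamma, depth, maximizing, table, path=()):
    """Null-window test around gamma on a toy tree (hypothetical sketch).

    Returns g. If g >= gamma, g is a lower bound on the node's value;
    otherwise g is an upper bound.
    """
    entry = table.get(path)
    if entry is not None and entry["depth"] >= depth:
        if entry["lower"] >= gamma:
            return entry["lower"]
        if entry["upper"] < gamma:
            return entry["upper"]

    if depth == 0:
        # Leaf nodes are at depth zero; in this toy tree a leaf is an integer.
        table[path] = {"depth": 0, "lower": node, "upper": node, "best": None}
        return node

    # Consider the best move saved by a previous visit first.
    moves = list(range(len(node)))
    if entry is not None and entry["best"] is not None:
        moves.remove(entry["best"])
        moves.insert(0, entry["best"])

    g = -math.inf if maximizing else math.inf
    best = None
    for m in moves:
        # The depth is decremented as the search descends.
        v = mt(node[m], gamma, depth - 1, not maximizing, table, path + (m,))
        if (maximizing and v > g) or (not maximizing and v < g):
            g, best = v, m
        # Stop as soon as the outcome of the test is decided.
        if (maximizing and g >= gamma) or (not maximizing and g < gamma):
            break

    # Keep the other bound if it was stored for the same depth.
    same_depth = entry is not None and entry["depth"] == depth
    lower = entry["lower"] if same_depth else -math.inf
    upper = entry["upper"] if same_depth else math.inf
    if g >= gamma:
        lower = g
    else:
        upper = g
    table[path] = {"depth": depth, "lower": lower, "upper": upper, "best": best}
    return g


tree = [[3, 5], [2, 9]]               # max root, two min children, depth 2
table = {}
first = mt(tree, 3, 2, True, table)   # fails high: value >= 3
second = mt(tree, 4, 2, True, table)  # fails low: value <= 3
```

On the second call the stored best move is tried first at every revisited node, and stored leaf bounds are reused instead of re-evaluating the leaf.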
Conventional test sets in the literature proved to be poor predictors of performance. Positions in test sets are usually selected to test a particular characteristic or property of the game (such as tactical combinations in chess) and are not necessarily indicative of typical game conditions. For our experiments, the programs were tested using a set of 20 positions that corresponded to move sequences from tournament games. By selecting move sequences rather than isolated positions, we are attempting to create a test set that is representative of real game search properties (including positions with obvious moves, hard moves, positional moves, tactical moves, different game phases, etc.).
Footnote 1: As a matter of fact, since we implemented MT using null-window alpha-beta searches, we did not have to make any changes at all to the code other than the aforementioned disabling of forward pruning and search extensions. We only had to introduce the MTD driver code.
All three programs were run on 20 test positions, searching to a depth such that all searched for roughly the same amount of time. Because of the low branching factor, Chinook was able to search to depth 15, iterating two ply at a time. Keyano searched to 9 ply and Phoenix to 7, both one ply at a time.
4.3 Base Line
Many papers in the literature use Alpha-Beta as the base-line for comparing the performance of other algorithms (for example, [2, 10]). The implication is that this is the standard data point which everyone is trying to beat. However, game-playing programs have evolved beyond simple Alpha-Beta algorithms. Most use Alpha-Beta enhanced with minimal window search (usually PVS [2] or NegaScout [18]), iterative deepening, transposition tables, move ordering and an initial aspiration window. Since this is the typical search algorithm used in high-performance programs (such as Chinook, Phoenix and Keyano), it seems more reasonable to use this as our base-line standard.
The worse the base-line comparison algorithm chosen, the better other algorithms appear to be. By choosing NegaScout enhanced with aspiration searching (Aspiration NegaScout) as our performance metric, we are emphasizing that it is possible to do better than the "best" methods currently practiced and that, contrary to published simulation results, some methods are inferior.
Because we implemented the MTD algorithms (like SSS* and DUAL*) using MT (equivalent to null-window Alpha-Beta calls with a transposition table), we were able to compare a number of algorithms that were previously seen as very different. By using MT as a common proof procedure, every algorithm benefited from the same enhancements concerning iterative deepening, transposition tables, and move ordering code. To our knowledge this is the first comparison of depth-first and best-first minimax search algorithms where all the algorithms are given identical resources.
4.4 Experiment Results
Figure 11 shows the performance of Chinook, Keyano and Phoenix, respectively, using the number of leaf evaluations (NBP or Number of Bottom Positions) as the performance metric. Figure 12 shows the performance of these algorithms using the number of nodes in the search tree (interior and leaf) as the metric. The graphs show the cumulative number of nodes over all previous iterations for a certain depth (which is realistic since iterative deepening is used) relative to Aspiration NegaScout. Since we could not test MTD-bi and MTD-step properly without modifying the transposition table code, these results are not shown. The search depths reached by the programs vary greatly because of the differing branching factors. In checkers, the average branching factor is approximately 3 (there are typically 1.2 moves in a capture position, and roughly 8 in a non-capture position), in Othello it is 10 and in chess it is 36.
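The metric plotted in the graphs (cumulative node count over all iterations up to a given depth, relative to Aspiration NegaScout) can be sketched as follows. The function name `relative_cumulative` and the per-iteration counts are invented for illustration; they are not data from the experiments.

```python
from itertools import accumulate

def relative_cumulative(counts, baseline_counts):
    """Cumulative node counts as a percentage of the baseline's cumulative counts.

    counts[i] and baseline_counts[i] are the nodes searched in iteration i
    by the algorithm under test and by the baseline, respectively.
    """
    cumulative = accumulate(counts)
    baseline = accumulate(baseline_counts)
    return [100.0 * c / b for c, b in zip(cumulative, baseline)]


# Hypothetical per-iteration node counts (not measured values).
algorithm_nodes = [10, 30, 60]
baseline_nodes = [10, 40, 50]
percentages = relative_cumulative(algorithm_nodes, baseline_nodes)
```

A value below 100 at some depth means the algorithm has searched fewer nodes in total than the baseline by the time that iteration completes.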
Figure 11: Leaf node count. Panels: Chinook (average over 20 trees), Keyano (average over 20 trees), chess (leaves relative to Aspiration NegaScout, %). X-axis: depth. Series: AspNS, AB, MTD-f, Dual*, SSS*.
Figure 12: Total node count. Panels: Chinook (average over 20 trees), Othello (total nodes relative to Aspiration NegaScout, %), chess (all nodes relative to Aspiration NegaScout). X-axis: depth. Series: AspNS, AB, MTD-f, Dual*, SSS*.
Over all three games, the best results are from MTD-f. Its leaf node counts are consistently better than Aspiration NegaScout, averaging at least a 5% improvement. More surprising is that MTD-f outperforms Aspiration NegaScout on the total node measure as well. Since each iteration requires repeated calls to MT (at least two and possibly many more), one might expect MTD-f to perform badly by this measure because of the repeated traversals of the tree. This suggests that MTD-f, on average, is calling MT close to the minimum number of times. For all three programs, MT gets called between 3 and 4 times on average. In contrast, the SSS* and DUAL* results are poor compared to NegaScout when all nodes in the search tree are considered. Each of these algorithms performs dozens and sometimes even hundreds of MT searches.
The success of MTD-f implies that it is better to start the search with a good guess as to where the minimax value lies, rather than starting at one of the extremes (+∞ or −∞). Clearly, MTD-f out-performs Alpha-Beta, suggesting that binary-valued searches are more efficient than wide-windowed searches. Of more interest is the MTD-f and NegaScout comparison. If both searches have the correct best move from the previous iteration, then both find the value of that move and then build proof trees to show the remaining moves are inferior. Since NegaScout does this with a minimal window, it visits exactly the same leaf nodes that MT would. Since our results (and those from simulations) show NegaScout to be superior to Alpha-Beta, this suggests that limiting the use of wide-window searches is beneficial. MT takes this to the extreme by eliminating all wide-window searches.
What is the minimum amount of search effort that must be done to establish the minimax value of the search tree? If we know the value of the search tree is f, then two searches are required: MTD-f(f − ε), which fails high, establishing a lower bound on f, and MTD-f(f + 1 − ε), which fails low and establishes an upper bound on f. Of the algorithms presented, MTD-f has the greatest likelihood of doing this minimum amount of work. The closer the approximation to f, the less work has to be done (see also figure 10). Considering this, it is not a surprise that both DUAL* and SSS* come out poorly. Their initial bounds for the minimax value are generally poor (−∞ and +∞ respectively), meaning that the many calls to MT result in significantly more interior nodes. In the literature, SSS* is regarded as good because it expands fewer leaf nodes than Alpha-Beta, and bad because of the algorithm complexity, storage requirements and storage maintenance overhead. Our results give the exact opposite view: SSS* is easy to implement because of the MT framework, uses as much storage as Aspiration NegaScout, and performs generally worse than Aspiration NegaScout when viewed in the context of iterative deepening and transposition tables. DUAL* is more efficient than SSS* but still comes out poorly in all the graphs measuring total node count. The Chinook leaf count graph is instructive, since here iterative deepening SSS* is generally worse than iterative deepening Alpha-Beta. This result runs opposite to both the proof and simulations for fixed-depth SSS* and Alpha-Beta. An interesting observation is that the effectiveness of SSS* appears to be a function of the branching factor; the larger the branching factor, the better it performs.
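The two-search minimum discussed at the start of the preceding paragraph can be sketched in Python. The sketch assumes integer leaf values, so a test with threshold gamma asks "is the value >= gamma?"; under that assumption gamma = f plays the role of the f − ε call and gamma = f + 1 the role of the f + 1 − ε call. The driver loop follows the commonly published integer formulation of MTD(f), not necessarily the authors' code. The names `null_window_test` and `mtd_f` and the toy tree are hypothetical, and the transposition table is omitted for brevity, so nodes are re-expanded between calls.

```python
import math

def null_window_test(node, gamma, maximizing):
    """Fail-soft test: a result >= gamma is a lower bound, otherwise an upper bound."""
    if isinstance(node, int):          # integer leaf of the toy tree
        return node
    if maximizing:
        g = -math.inf
        for child in node:
            g = max(g, null_window_test(child, gamma, False))
            if g >= gamma:             # test decided: fail high
                break
    else:
        g = math.inf
        for child in node:
            g = min(g, null_window_test(child, gamma, True))
            if g < gamma:              # test decided: fail low
                break
    return g

def mtd_f(root, first_guess):
    """Return (minimax value, number of test calls) for a max root."""
    g = first_guess
    lower, upper = -math.inf, math.inf
    calls = 0
    while lower < upper:
        # Probe just above the current lower bound, otherwise at the guess.
        gamma = g + 1 if g == lower else g
        g = null_window_test(root, gamma, True)
        calls += 1
        if g < gamma:
            upper = g                  # failed low: new upper bound
        else:
            lower = g                  # failed high: new lower bound
    return g, calls


tree = [[3, 5], [2, 9]]                # minimax value 3
value, calls = mtd_f(tree, 3)          # first guess equals the true value
```

With a first guess equal to the true value, the loop makes exactly one fail-high and one fail-low call before the bounds meet.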
Aspiration NegaScout is better than Alpha-Beta. This result is consistent with [22], which showed Aspiration NegaScout to be a small improvement over Alpha-Beta when transposition tables and iterative deepening were used. NegaScout uses a wide-window search for the principal variation (PV) and all re-searches. The wide-window PV search result gives a good first approximation to the minimax value. That approximation is then used to search the rest of the tree with minimal window searches, which are equivalent to MT calls. If these refutation searches are successful (no re-search is needed), then NegaScout deviates from MTD-f only in the way it searches the PV for a value. MTD-f uses MT also for searching the PV, and for re-searches after a fail high. Since MTD-f does not derive its first guess from the tree like NegaScout does, it must get it externally. In our experiments the minimax value from the previous iterative deepening iteration was used for this purpose.
Simulation results show that for fixed-depth searches, without transposition tables and iterative deepening, SSS*, DUAL* and NegaScout are major improvements over simple Alpha-Beta [7, 9, 19]. For example, one study shows SSS* and DUAL* building trees that are half the size of those built by Alpha-Beta [9]. This is in sharp contrast to the results reported here. Why is there such a disparity with the previously published work? Including transposition tables and iterative deepening in the experiment improves the search efficiency in two ways:
• improve move ordering so that the likelihood of the best move being considered first at a cutoff node is very high, and
• eliminate large portions of the search by having the search tree be treated as a search graph. A path that transposes into a path already searched can reuse the previously computed result.
The move ordering is improved to the extent that all algorithms are converging on the minimal search tree.
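To give a sense of scale for the minimal search tree, the classical result of Knuth and Moore [8] states that a uniform tree with branching factor w and depth d has a minimal tree of w^ceil(d/2) + w^floor(d/2) − 1 leaves. The short Python sketch below evaluates that formula. Applying it to real games is an idealization, since the branching factors quoted earlier are averages and real trees are not uniform; the function names are hypothetical.

```python
def minimal_leaves(w, d):
    """Leaves of the minimal tree for a uniform tree (Knuth and Moore)."""
    return w ** ((d + 1) // 2) + w ** (d // 2) - 1

def full_leaves(w, d):
    """Leaves of the full minimax tree of the same shape."""
    return w ** d


# Idealized chess-like tree: branching factor 36, depth 7.
best_case = minimal_leaves(36, 7)
worst_case = full_leaves(36, 7)
```

The gap between the two counts is what perfect move ordering buys; algorithms that are all close to the minimal tree have little room left to differ from one another.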
How effective is the move ordering? At a node with a cutoff, only one move should be examined. Data gathered from Phoenix, Keyano, and Chinook show that an average of around 1.2, 1.2 and 1.1 moves, respectively, are considered at cut nodes. On average, over 95% of the time the move ordering is successful. Clearly, the low ply numbers should have very good ordering because of the iterative deepening (typically 97-98%). Of more interest are these figures for the higher ply numbers. These nodes occur near the bottom of the tree and usually do not have the benefit of transposition table information to guide the selection of the first move to consider. Nevertheless, even these nodes have a success rate in excess of 90% (80% for Keyano, which does not have the history heuristic). Campbell and Marsland define strongly ordered trees to have the best move first 80% of the time [2]. Kaindl et al. called 90% very strongly ordered [7]. Our results suggest there should be a new category, almost perfectly ordered, corresponding to over 95% success of the first move at cutoff nodes. Given that the simulations used poorer move ordering than is seen in practice, it is not surprising that their results have little relation to ours.
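The two move-ordering statistics quoted above (the fraction of cut nodes where the first move causes the cutoff, and the average number of moves considered at a cut node) could be computed from a search log along the following lines. This is a sketch under assumptions: the function name `ordering_stats` and the input format (for each cut node, the 1-based position of the move that produced the cutoff) are hypothetical, and the sample data is invented.

```python
def ordering_stats(cutoff_positions):
    """Return (first-move success rate, average moves considered per cut node).

    cutoff_positions holds, for each cut node, the 1-based index of the
    move that caused the cutoff, which is also the number of moves examined.
    """
    total = len(cutoff_positions)
    first_move_hits = sum(1 for k in cutoff_positions if k == 1)
    return first_move_hits / total, sum(cutoff_positions) / total


# Invented sample: 19 cut nodes cut off on the first move, one on the fifth.
sample = [1] * 19 + [5]
success_rate, average_moves = ordering_stats(sample)
```

In this invented sample a 95% first-move success rate coexists with an average of 1.2 moves per cut node, which shows how the rare misordered nodes dominate the excess above 1.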
4.5 Execution Time
The bottom line for practitioners is execution time. Since we did not have the resources to run all our experiments on identical and otherwise idle machines, we do not show execution time graphs. However, comparing results for the same machines we found that MTD-f is consistently the fastest algorithm. It is about 5% faster than Aspiration NegaScout for checkers and Othello, and about 10% faster for chess (depending in part on the quality of the score of the previous iteration). We found that for the Phoenix program execution time is strongly correlated to the leaf-node metric, while for Chinook and Keyano it is strongly correlated to the total-node metric.
4.6 MTD-best
The algorithms discussed so far are designed to find the minimax value of the tree. MTD-best is only interested in the best move and not the best value. Thus it has the potential to build even smaller trees. This algorithm uses the value of the previous iteration to try to establish only a lower bound on the best move, and then (if this was successful) an upper bound on the other moves. MTD-best proves the superiority of the best move by only constructing a lower bound. This algorithm saves on node expansions by not constructing an upper bound on the value of the best move. Figure 13 shows that MTD-best can be slightly better than MTD-f. This result must be taken in the context that MTD-best is trying to solve a simpler problem: find the best move, not the best value. If the choice of best move is the same as in the previous iteration, MTD-best is effective at proving that move is best. When the candidate best move turns out not to be best, MTD-best is inefficient since it ends up doing more work because its initial assumption (the best move) is wrong. Thus on a given search, MTD-best may be much better or much worse than MTD-f. In the test data, MTD-best was effective at proving the best move about half of the time. Since the algorithm is often performing a few percent better than MTD-f, the potential gains of proving a move best out-weigh the potential losses.
Our implementation of MTD-best is just a first attempt. We believe that there is still room for improving the algorithm.
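The idea behind MTD-best (a lower bound on the candidate move, then upper bounds on the other moves) can be illustrated with the following Python sketch. It is only one possible reading of the description above, not the authors' implementation: it assumes integer leaf values, works only at the root, omits the transposition table, and simply reports failure where a real program would have to do further work. The names `null_window_test` and `prove_best_move` and the toy tree are hypothetical.

```python
import math

def null_window_test(node, gamma, maximizing):
    """Fail-soft test: a result >= gamma is a lower bound, otherwise an upper bound."""
    if isinstance(node, int):          # integer leaf of the toy tree
        return node
    if maximizing:
        g = -math.inf
        for child in node:
            g = max(g, null_window_test(child, gamma, False))
            if g >= gamma:
                break
    else:
        g = math.inf
        for child in node:
            g = min(g, null_window_test(child, gamma, True))
            if g < gamma:
                break
    return g

def prove_best_move(root, candidate, guess):
    """True if the candidate root move is proven at least as good as all others.

    root is a max node whose children are min nodes; guess is typically the
    value from the previous iteration.
    """
    # Step 1: lower bound on the candidate only (is its value >= guess?).
    if null_window_test(root[candidate], guess, False) < guess:
        return False
    # Step 2: upper bound on every other move (is its value <= guess?).
    for move, child in enumerate(root):
        if move == candidate:
            continue
        if null_window_test(child, guess + 1, False) >= guess + 1:
            return False
    return True


tree = [[3, 5], [2, 9]]                # move 0 is worth 3, move 1 is worth 2
proven = prove_best_move(tree, 0, 3)
```

Note that no upper bound on the candidate is ever established, so the proof can succeed with a guess below the candidate's true value, as long as that guess still separates it from the other moves.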
5 Conclusions
Over thirty years of research have been devoted to improving the efficiency of alpha-beta searching. The MT family of algorithms is comparatively new, without the benefit of intense investigations. Given that MTD-f is already out-performing our best alpha-beta based implementations in real game-playing programs, future research can only make variations on this algorithm more attractive. MT is a simple and elegant paradigm for high performance game-tree search algorithms.
The purpose of a simulation is to exactly model an algorithm to gain insight into its performance. Simulations are usually performed when it is too difficult or too expensive to construct the proper experimental environment. The simulation parameters should be chosen to come as close to exactly modeling the desired scenario as is possible. In the case of game-tree searching, the case for simulations is weak. There is no need to do simulations when there are quality game-playing programs available for obtaining actual data. Further, as this paper has demonstrated, simulation parameters can be incorrect, resulting in large errors in the results that lead to misleading conclusions. In particular, the failure to include iterative deepening, transposition tables, and almost perfectly ordered trees in a simulation are serious omissions.
With easy to understand MT-based algorithms out-performing NegaScout, this leads to the obvious question: Why are you still using alpha-beta?
Figure 13: Leaf node count MTD-best. Panels: Chinook, Keyano and Phoenix (each average over 20 trees). X-axis: depth. Series: AspNS, AB, MTD-f, MTD-best.
Acknowledgements
This work has benefited from discussions with Mark Brockington (author of Keyano), Yngvi Bjornsson and Andreas Junghanns. The financial support of the Netherlands Organization for Scientific Research (NWO), the Natural Sciences and Engineering Research Council of Canada (grant OGP-5183) and the University of Alberta Central Research Fund is gratefully acknowledged.
References
[1] Hans J. Berliner. The B* tree search algorithm: A best-first proof procedure. Artificial Intelligence, 12:23-40, 1979.
[2] Murray S. Campbell and T. A. Marsland. A comparison of minimax tree search algorithms. Artificial Intelligence, 20:347-367, 1983.
[3] Arie de Bruin, Wim Pijls, and Aske Plaat. Solution trees as a basis for game tree search. Technical Report EUR-CS-94-04, Department of Computer Science, Erasmus University Rotterdam, P.O. Box 1738, 3000 DR Rotterdam, The Netherlands, May 1994.
[4] Carl Ebeling. All the Right Moves. MIT Press, Cambridge, Massachusetts, 1987.
[5] Rainer Feldmann. Spielbaumsuche mit massiv parallelen Systemen. PhD thesis, Universität-Gesamthochschule Paderborn, May 1993.
[6] Toshihide Ibaraki. Generalization of alpha-beta and SSS* search procedures. Artificial Intelligence, 29:73-117, 1986.
[7] Hermann Kaindl, Reza Shams, and Helmut Horacek. Minimax search algorithms with and without aspiration windows. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-13(12):1225-1235, December 1991.
[8] Donald E. Knuth and Ronald W. Moore. An analysis of alpha-beta pruning. Artificial Intelligence, 6(4):293-326, 1975.
[9] T. A. Marsland, Alexander Reinefeld, and Jonathan Schaeffer. Low overhead alternatives to SSS*. Artificial Intelligence, 31:185-199, 1987.
[10] T. Anthony Marsland. A review of game-tree pruning. ICCA Journal, 9(1):3-19, March 1986.
[11] T. Anthony Marsland and Alexander Reinefeld. Heuristic search in one and two player games. Technical report, University of Alberta, Paderborn Center for Parallel Computing, February 1993. Submitted for publication.
[12] Agata Muszycka and Rajjan Shinghal. An empirical comparison of pruning strategies in game trees. IEEE Transactions on Systems, Man and Cybernetics, 15(3):389-399, May/June 1985.
[13] Judea Pearl. Asymptotical properties of minimax trees and game searching procedures. Artificial Intelligence, 14(2):113-138, 1980.
[14] Judea Pearl. The solution for the branching factor of the alpha-beta pruning algorithm and its optimality. Communications of the ACM, 25(8):559-564, August 1982.
[15] Judea Pearl. Heuristics - Intelligent Search Strategies for Computer Problem Solving. Addison-Wesley Publishing Co., Reading, MA, 1984.
[16] Wim Pijls and Arie de Bruin. Searching informed game trees. Technical Report EUR-CS-92-02, Erasmus University Rotterdam, Rotterdam, NL, October 1992. Extended abstract in Proceedings CSN 92, pp. 246-256, and Algorithms and Computation, ISAAC 92 (T. Ibaraki, ed), pp. 332-341, LNCS 650.
[17] Aske Plaat, Jonathan Schaeffer, Wim Pijls, and Arie de Bruin. Popular misconceptions about SSS* in practice. Technical Report TR-CS-94-17, Department of Computing Science, University of Alberta, Edmonton, AB, Canada, December 1994.
[18] Alexander Reinefeld. An improvement of the Scout tree-search algorithm. ICCA Journal, 6(4):4-14, 1983.
[19] Alexander Reinefeld. Spielbaum Suchverfahren, volume 200 of Informatik-Fachberichte. Springer Verlag, 1989.
[20] Igor Roizen and Judea Pearl. A minimax algorithm better than alpha-beta? Yes and no. Artificial Intelligence, 21:199-230, 1983.
[21] Jonathan Schaeffer. Experiments in Search and Knowledge. PhD thesis, Department of Computing Science, University of Waterloo, Canada, 1986. Available as University of Alberta technical report TR86-12.
[22] Jonathan Schaeffer. The history heuristic and alpha-beta search enhancements in practice. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-11(1):1203-1212, November 1989.
[23] Jonathan Schaeffer, Joseph Culberson, Norman Treloar, Brent Knight, Paul Lu, and Duane Szafron. A world championship caliber checkers program. Artificial Intelligence, 53(2-3):273-290, 1992.
[24] D. J. Slate and L. R. Atkin. Chess 4.5 - the Northwestern University chess program. In P. W. Frey, editor, Chess Skill in Man and Machine, pages 82-118, New York, 1977. Springer-Verlag.
[25] G. C. Stockman. A minimax algorithm better than alpha-beta? Artificial Intelligence, 12(2):179-196, 1979.
[26] Jean-Christophe Weill. The NegaC* search. ICCA Journal, 15(1):3-7, March 1992.