27
University of Alberta A New Paradigm for Minimax Search by Aske Plaat, Jonathan Schaeffer, Wim Pijls and Arie de Bruin Technical Report TR 94±18 December 1994 DEPARTMENT OF COMPUTING SCIENCE The University of Alberta Edmonton, Alberta, Canada No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

MTD-f

Embed Size (px)

DESCRIPTION

MTD-f

Citation preview

Page 1: MTD-f

University of Alber ta

A NewParadigm for Minimax Search

by

AskePlaat, JonathanSchaeffer, Wim Pijls and Arie deBruin

TechnicalReport TR 94±18December1994

DEPARTMENT OF COMPUTING SCIENCEThe University of Alber taEdmonton, Alber ta, Canada

No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 2: MTD-f

iNo license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 3: MTD-f

A NewParadigmfor Minimax Search

AskePlaat,ErasmusUniversity, [email protected], Universityof Alberta, [email protected]

Wim Pijls, ErasmusUniversity, [email protected] deBruin, ErasmusUniversity, [email protected]

ErasmusUniversity, Universityof Alberta,Departmentof ComputerScience, Departmentof ComputingScience

P.O.Box 1738, 615GeneralServicesBuilding,3000DR Rotterdam, Edmonton,Alberta,

TheNetherlands CanadaT6G2H1

Abstract

This paperintroducesa new paradigmfor minimax game-treesearchalgo-rithms.MT isamemory-enhancedversionof Pearl'sTestprocedure.By changingtheway MT is called,a numberof best-®rstgame-treesearchalgorithmscanbesimplyandelegantlyconstructed(includingSSS*).

Most of theassessmentsof minimaxsearchalgorithmshavebeenbasedonsimulations.However, thesesimulationsgenerallydonotaddresstwo of thekeyingredientsof highperformancegame-playingprograms:iterativedeepeningandmemoryusage.This paperpresentsexperimentaldatafrom threegame-playingprograms(checkers,Othello andchess),coveringthe rangefrom low to highbranchingfactor. The improvedmoveorderingdueto iterativedeepeningandmemoryusageresultsin signi®cantlydifferentresultsfrom thoseportrayedin theliterature.WhereassomesimulationsshowAlpha-Betaexpandingalmost100%moreleaf nodesthanotheralgorithms[9], our resultsshowedvariationsof lessthan20%.

Onenew instanceof our framework(MTD-f) out-performsour bestalpha-betasearcher(aspirationNegaScout)on leaf nodes,total nodesandexecutiontime. To our knowledge,thesearethe ®rstreportedresultsthat comparebothdepth-®rstandbest-®rstalgorithmsgiventhesameamountof memory.Keywords: Minimax-treesearchalgorithms,Alpha-Beta,SSS*.

iiNo license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 4: MTD-f

Contents

1 Intr oduction 1

2 Memory-enhancedTest 22.1 A BinaryDecisionProcedure . . . . . . . . . . . . . . . . . . . . . . 22.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

3 Drivers for MT 73.1 Elegance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103.2 Ef®ciency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103.3 MemoryUsage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103.4 StartingValue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

4 Experiments 134.1 PreviousWork . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134.2 ExperimentDesign. . . . . . . . . . . . . . . . . . . . . . . . . . . . 144.3 BaseLine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154.4 ExperimentResults . . . . . . . . . . . . . . . . . . . . . . . . . . . 154.5 ExecutionTime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194.6 MTD-best. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

5 Conclusions 20

iiiNo license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 5: MTD-f

1 Intr oduction

For over 30 years,Alpha-Betahasbeenthe algorithmof choicefor searchinggametrees. Using a simpleleft-to-right depth-®rsttraversal,it is ableto ef®cientlysearchtrees[8, 14]. Severalimportantenhancementswereaddedto the basicAlpha-Betaframework,includingiterativedeepening,transpositiontables[24], thehistoryheuris-tic [22] andminimal window searching[2, 13, 18]. Severalstudiesshowedthat thealgorithm's ef®ciencywas approachingoptimal and that therewas little room forimprovement[4, 5, 22].

In 1979,Stockmanintroducedthe SSS* algorithm which was able to provablyexpandfewerleafnodesthanAlpha-Betaby adoptingabest-®rstsearchstrategy[25].Scienti®cskepticismquickly followed as it becameevident that SSS* had seriousimplementationdrawbacks.Theseproblemsrevolvedaroundthe OPENlist, a datastructurewhosesizewasexponentialin thedepthof thesearchtree,andtheexpensiveoperationsrequiredto maintainthelist in sortedorder[20]. Nevertheless,thepotentialfor buildingsmallersearchtreeshadbeendemonstrated.Unfortunately, it seemedthatSSS*wasaninterestingideafor thetheoreticians,but it failedto becomeanalgorithmusedby thepractitioners.

This paperintroducestheMemory-enhancedTest(MT) algorithm,a variationonPearl's Testprocedure[13]. This routineef®cientlysearchesa gametreeto answera binary question. MT canbe calledfrom a simpledriver routinewhich may makerepeatedcalls to MT. By usingmemoryto storepreviouslyseennodes(a standardtranspositiontable),MT canbeusedef®cientlyto re-searchtrees.

A searchyielding a singlebinarydecisionis usuallynot usefulfor game-playingprograms.Repeatedcalls to MT canbeusedto homein on the minimaxvalue. ByconstructingdifferentdriverprogramsthatcallMT, differentalgorithmscanbecreated.In particular, a variety of best-®rstalgorithmscanbe implementedusingdepth-®rstsearch,suchasSSS*andDUAL* [9,19]. Thesurprisingresultis thataseriesof binary-valuedsearchesis moreeffectiveatdeterminingtheminimaxvaluethansearchesovera wide rangeof values. It is interestingto notethat the basicAlpha-BetaalgorithmcannotbecreatedusingMT, becauseits widesearchwindowcausesfewercutoffs thanMT-basedalgorithms.

TheMT drivers(MTD) caneasilybechangedto createnewalgorithms.Somenewalgorithmsareintroducedin this paperandoneof them,MTD-f, out-performsSSS*,DUAL*, Alpha-BetaandPVS/NegaScout[2, 18] onaverage.While SSS*uses+∞ asits initial searchboundandDUAL* uses ∞, MTD-f usesthe resultof thepreviousiterationin aniterativedeepeningsearch.Startingwith aboundcloserto theexpectedoutcomeincreasessearchef®ciency, aswewill show.

Most papersin the literaturecomparegame-treesearchalgorithmsusingsimula-tions. Typically, thesimulationsdo not useiterativedeepeningor transpositiontables(for example,[7, 9, 12]). Sincethesearenecessaryingredientsfor achievinghighper-formancein realapplications,this is a seriousomission.Ratherthanusesimulations,our resultshavebeentakenfrom threegame-playingprograms. Theseinclude thecheckersprogramChinook(a gamewith low branchingfactor), theOthelloprogramKeyano(mediumbranchingfactor) andthe chessprogramPhoenix(high branchingfactor). All threeprogramsarewell-knownin their respectivedomains.Experimen-tally comparingavarietyof best-®rstanddepth-®rstalgorithmsusingthesamestorage

1No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 6: MTD-f

requirementspaintsaverydifferentpictureof therelativestrengthsof thealgorithms.In particular, theresultsof non-iterativedeepeningexperimentsreportedin thelitera-turearemisleading.With iterativedeepeningandtranspositiontables,moveorderingis improvedto thepointwhereall thealgorithmsstartapproachingtheminimalsearchtreeandthedifferencesbetweenthealgorithmsbecomelesspronounced.

Since binary decisionproceduresare shown to be more effective than wide-windowedsearches,thereis no goodreasonfor continuedusageof wide-windowedAlpha-Beta. Although every introductorybook on arti®cial intelligencediscussesAlpha-Beta,we would arguethatMT is a simpleralgorithm,is easierto understand,andis moreef®cientthanAlpha-Beta. To put it bluntly, why would anyonewant touseAlpha-Beta?

2 Memory-enhancedTest

This sectiondiscussestheadvantagesof usinga binarydecisionprocedurefor doinggame-treesearch.

2.1 A BinaryDecisionProcedurePearlintroducedtheconceptof aproofprocedurefor gametreesin hisScoutalgorithm[15]. A proof proceduretakesa searchassertion(for example,theminimaxvalueofthetreeis ≥ 10) andreturnsa binaryvalue(assertionis trueor false). It wasinventedto simplify theanalysisof theef®ciencyof Alpha-Beta. In doing theanalysis,Pearldiscoveredthatanalgorithmusingabinarydecisionprocedurecouldsearchtreesmoreef®ciently. Theproof procedurewascalledTest,sinceit wasusedto testwhethertheminimaxvalueof asubtreewould lie aboveor belowaspeci®edthreshold.ScoutwasthedriverprogramthatcalledTest.Enhancementsto thealgorithmweredevelopedandanalyzed(thisiswell documentedin [19]). It turnsoutthatScout-basedalgorithmswillneverconsidermoreuniqueleaf nodesthanwould Alpha-Beta. For thespecialcaseof aperfectlyorderedtreebothAlpha-BetaandtheScout-variantssearchtheso-calledminimal search tree [8]. Simulationshaveshownthat, on average,Scout-variants(suchasNegaScout)signi®cantlyout-performAlpha-Beta[7, 9]. However, whenusedin practicewith iterativedeepening,aspirationsearchingandtranspositiontables,thequality of themoveorderinggreatlyincreases.As a result,therelativeadvantageofNegaScoutsigni®cantlydecreases[22].

Testdoesanoptimaljob of answeringa binary-valuedquestion.However, game-treesearchesareusuallyoverarangeof values.Wewouldlike to usetheef®ciencyofTestto®ndtheminimaxvalueof asearchtree.RepeatedcallstoTestwill beinef®cient,unlessTestis modi®edto reusetheresultsof previoussearches.EnhancingTestwithmemoryyieldsanewalgorithmwhichwecall MT (shortfor Memory-enhancedTest),asillustratedin ®gure1. Thestoragecanbeorganizedasa familiar transpositiontable[10]. Beforea nodeis expandedin a search,a checkis madeto seeif thevalueof thenodeis availablein memory, theresultof a previoussearch.Lateronwewill seethataddingstoragehassomeotherbene®tsthatarecrucialfor thealgorithm's ef®ciency.

We would like to formulateour versionof Testasclearandconciseaspossible,usingtheoftenusedNegamaxformulation[8]. Hereweencounteraproblem.Supposewewantto testwhethertheminimaxvalueof maximizingnoden is at leastω. A callTest(n, ω) would do this. A child returninga value≥ ω is suf®cientto answerthe

2No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 7: MTD-f

question,andthereforecausesacutoff. Unfortunately, for aminimizingnode,acutoffshouldoccuronly whenits returnvalue g > ω (usinga Negamaxformulation).Thereis a simpleway of removingtheª>º/ª ≥º asymmetry:This canbeobtainedbyusinginputparametersof theform ω ε andω + ε, whereε is avaluesmallerthanthedifferencebetweenanytwo leafnodeevaluations.ThenMT(n, ω ε) guaranteesthatall nodesin thetreeareeithergreaterthanor lessthanω ε; no tiesarepossible.So,therearetwo differencesbetweenMT andTest. Oneis thatMT is calledwith ω εinsteadof ω, sothatthereis noneedfor theª=º partin thecutoff test,obviatingextratestsin thecodeof MT or arti®cialincrementing/decrementing of theinputparameter(asin [11]). Thesecond(andmorefundamental)differenceis thatMT usesstoragetopasson searchresultsfrom onepassto thenext,makingef®cientmulti-passsearchespossible.

Figure1 showsthepseudo-codefor MT. Theroutineassumesanevaluateroutinethat assignsa value to a node. Determiningwhen to call evaluateis application-dependentand is hiddenin the de®nitionof the conditionn = leaf. For a depthd®xed-depthsearch,a leaf is anynodethat is d movesfrom the root of the tree. Thesearchreturnsanupperor lower boundon thesearchvalueat eachnode,denotedbyƒ+ andƒ respectively. Beforesearchinganode,thetranspositiontableinformationisretrievedand,if it hasbeenpreviouslysearcheddeepenough,thesearchis cutoff. Atthecompletionof a node,theboundon thevalueis stored in the transpositiontable.Theboundsstoredwith eachnodearedenotedusingPascal's dot-notation.

function MT(n, γ) → g;{ precondition:γ = anyleaf-evaluation;MT mustbecalledwith γ = ω ε to proveg < ω or g ≥ ω }

if retrieve(n) = foundthenif n.ƒ > γ then return n.ƒ ;if n.ƒ+ < γ then return n.ƒ+;

if n = leaf thenn.ƒ+ := n.ƒ := g := evaluate(n);

elseg := ∞;c := ®rstchild(n);while g < γ and c = do

g := max(g, MT(c, γ));c := nextbrother(c);

if g < γ then n.ƒ+ := g elsen.ƒ := g;store(n);return g;

Figure1: MT, amemory-enhancedversionof Pearl's Test.

Usually we want to know morethanjust a boundon the minimax value. Usingrepeatedcallsto MT, thesearchcanhomein ontheminimaxvalueuntil it is found.Toachievethis,MT mustbecalledfromadriverroutine.Oneideafor suchadriverwould

3No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 8: MTD-f

function MTD+∞(n) → ƒ;g := +∞;repeat

bound:= g;g := MT(n, bound ε);

until g = bound;return g;

Figure2: MTD+∞: A sequenceof MT searchesto ®ndƒ.

beto startatanupperboundfor thesearchvalue,ƒ+ = +∞. Subsequentcallsto MT canlower this bounduntil theminimaxvalueis reached.Figure2 showsthepseudocodefor this driver calledMTD+∞. Thevariableg is at all timesanupperboundƒ+ on theminimaxvalueof the root of thegametree[17]. Surprisingly, MTD+∞ expandsthesameleaf nodesin the sameorderasSSS*,providedthat no collisionsoccurin thetranspositiontable,andthatits sizeis of orderO(w d/2 ) [3, 17].

2.2 ExampleThe following exampleillustrateshow MTD+∞ traversesa tree as it computesasequenceof upperboundson thegamevalue. For easeof comparison,the exampletreein ®gure3 is thesameashasbeenusedin otherpapers[15]. Figures4±7showthefour stagesinvolvedin building thetree. In these®gures,theg valuesreturnedbyMT aregivenbesideeachinterior node. Maximizing nodesaredenotedby a square;minimizingnodesby a circle. Thediscussionusesthecodein ®gures1 and2.

a

b

c

d

e

41l

5

ƒ

g

12 90 101 80 20 30

h

i

j

k

34 80

l

m

36o

35

p

q

r

50s

36 25 3

Figure3: Exampletreefor MTD+∞.

First Pass:(®gure4)MTD+∞ initially startswith a valueof +∞ (1000for our purposes).MT expandsallbranchesatmaxnodesandasinglebranchataminnode.In otherwords,themaximumnumberof cutoffs occursinceall valuesin the treeare< 1000 ε (< ∞). Figure4showstheresultingtreethatis searchedandstoredin memory. Theminimaxvalueofthistreeis 41. Notethatatmaxnodesa, c andi anƒ+ valueis savedin thetransposition

4No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 9: MTD-f

ƒ+ = 41 a

ƒ = 41 b

ƒ+ = 41 c

ƒ = 41 d

e

41

ƒ ƒ = 12

g

12

h ƒ = 36

i ƒ+ = 36

ƒ = 34 j

k

34

l ƒ = 36

m

36

Figure4: Pass1.

ƒ+ = 36 a

ƒ = 12 b

ƒ+ = 12 c

ƒ = 5 d

e

41

n

5

ƒ = 12

ƒskipped

ƒ = 36

h

skipped

Figure5: Pass2.

table,while anƒ valueis savedatmin nodesb, d, ƒ,h, j andl. Bothboundsarestoredat leaf nodese, g, k andm sincethe minimax value for that nodeis exactlyknown.Thesestorednodeswill beusedin thesubsequentpasses.SecondPass:(see®gure5)MT is calledagainsincethe previouscall to MT improvedthe upperboundon theminimaxvaluefrom 1000to 41 (thevariableg in ®gure2). MT now attemptsto ®ndwhetherthe treevalueis lessthanor greaterthan41 ε. The left-mostpath fromthe root to a leaf, whereeachnodealongthat pathhasthesameg valueasthe root,is calledthe critical path or principal variation. The patha, b, c, d down to e is thecritical paththatis descendedin thesecondpass.edoesnothaveto bereevaluated;itsvaluecomesfrom thetranspositiontable.Sinced is aminimizingnode,the®rstchildedoesnotcauseacutoff (value< (41 ε)) andchild n mustbeexpanded.n's valuegetsbackedup to c, who thenhasto investigatechild ƒ. Theboundon ƒ, computedin the previouspass,causesthe searchto stopexploringthis branchimmediately. ctakeson themaximumof 12and5, andthisbecomesthenewvaluefor b. Sinceh hasavalue< 41 ε (from thepreviouspass),thesearchis complete;bothof a's childrenprovethata hasa valuelessthan41. Theresultingtreede®nesa newupperboundof

5No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 10: MTD-f

36on theminimaxvalue.

ƒ+ = 35 a

ƒ = 12

b

skipped

h ƒ = 35

i ƒ+ = 35

ƒ = 34

j

skipped

l ƒ = 35

m

36o

35

Figure6: Pass3.

Third Pass:(see®gure6)The searchnow attemptsto lower the minimax valuebelow 36. From the previoussearch,weknowb hasavalue< 36buth doesnot. Thealgorithmfollows thecriticalpathto thenodegiving the36 (h to i to l to m). m's valueis known,andit is not lessthan36. Thusnodeo mustbe examined,yielding a valueof 35. From i's point ofview, thenewvaluefor l (35) andtheboundon j 's value(34 from the®rstpass)arebothlessthan36. i getsthemaximum(35) andthis getspropagatedto theroot of thetree.Theboundon theminimaxvalueat theroothasbeenimprovedfrom 36 to 35.FourthPass:(see®gure7)Theprevioussearchloweredtheboundfrom 36 to 35, meaningconvergencehasnotoccurredandthe searchmustcontinue. The searchfollows the critical patha, h, i, lando. At nodel, bothits childrenimmediatelyreturnwithouthavingbeenevaluated;theirvalueis retrievedfrom thetranspositiontable.Notethatthepreviouspassstoredanƒ valuefor l, while this passwill storea ƒ+. Thereis roomfor optimizationhere

a ƒ = 35

ƒ = 12

b

skipped

h ƒ+ = 35

ƒ = 35 i

ƒ = 34

j

skipped

l ƒ+ = 35

m

36o

35

p ƒ = 36

q ƒ+ = 36

r

50s

36

Figure7: Pass4.

6No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 11: MTD-f

by recognizingthatall of l's childrenhavebeenevaluatedandthusweknowtheexactvaluefor l (see[16, 18]). The valueof l doesnot changeand j 's boundprecludesitfrom beingsearched,thusi's valueremainsunchanged.i cannotlower h's value(nocutoff occurs),so thesearchexploresp. p considersq which, in turn, mustsearchrands. Sincep is a maximizingnode,thevalueof q (36) causesa cutoff. Both of h'schildrenaregreaterthan35 ε. Nodea wassearchedattemptingto showwhetherit' s valuewaslessthanor greaterthan35 ε. h providesthe answer;greaterthan.This call to MT fails high, meaningwe havea lower boundof 35 on thesearch.Thepreviouscall to MT establishedanupperboundof 35. Thustheminimaxvalueof thetreeis provento be35.

2.3 DiscussionMTD+∞ causesMT to expandthesameleafnodesin thesameorderasSSS*(aproofcanbefound in [17]). How doesMT, a depth-®rstsearchprocedure,examinenodesin a best-®rstmanner?Thevalueof ƒ+ ε causesMT to exploreonly nodesthatcanlower theupperboundat theroot. This is thebest-®rstexpansionorderof SSS*: inthethird passof theexample,loweringthevalueof nodeb cannotin¯uenceƒ+(a), butnodeh can.Selectingnodeh givesabest-®rstexpansionorder.

Theexampleillustratesthatstorageiscritical to theperformanceof amulti-passMTalgorithm. Without it, theprogramwould revisit interior nodeswithout thebene®tofinformationgainedfrom theprevioussearch.Instead,MT canretrievevaluablesearchinformationfor anode,suchasanupperor lowerbound,usingarelativelycheaptablelookup. The storagetableprovidestwo bene®ts:(1) preventingunnecessarynodere-expansion,and(2) guidingMT alongthecritical path,ensuringabest-®rstselectionscheme.Botharenecessaryfor theef®ciencyof thealgorithm.

Onecouldaskthe questionwhethera simpleone-passAlpha-Betasearchwouldnot be as ef®cient. To seewhy MT is more ef®cientit helpsto think of MT as anull-window Alpha-Betasearch(β = α + 1) with a transpositiontable. Null-windowsearchinghasbeenshownequivalentto Test and hasbeendiscussedby numerousauthors[2, 9, 15, 19]. Variouspaperspoint out that a tighter Alpha-Betawindowcausesmorecutoffs thana wider window, all otherthingsbeingequal(for example,[2, 17]). SinceMT doesnot re-expandnodesfrom a previouspass,it cannothavefewercutoffs thanwide-windowedAlpha-Betafor newleaf nodes.This impliesthatanysequenceof MT callswill bemoreef®cient(it will neverexpandmoreleafnodesandusuallysigni®cantlylessnodes)thanacall to Alpha-Betawith window( ∞, +∞).

3 Drivers for MT

Havingseenonedriver for MT, theideascanbeencompassedin a generalizeddriverroutine.Thedrivercanberegardedasprovidingaseriesof callsto MT to successivelyre®neboundson theminimaxvalue.

The driver codecanbe parameterizedso that onepieceof codecanconstructavarietyof algorithms.Thetwo parametersneededare:

®rst The®rststartingboundfor MT.

next A searchhasbeencompleted.Useits resultto determinethenextboundfor MT.

7No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 12: MTD-f

function MTD(®rst,next,n) → ƒ;ƒ+ := +∞; ƒ := ∞;bound:= g := ®rst;repeat

{ Thenextoperatormustsetthevariablebound}next;g := MT(n,bound ε);if g < boundthen ƒ+ := g elseƒ := g;

until ƒ = ƒ+;return g;

Figure8: A frameworkfor MT drivers.

Usingtheseparameters,analgorithmusingourMT driver, MTD, canbeexpressedasMTD(®rst,next). Thecorrespondingpseudocodecanbefoundin ®gure8. A numberof interestingalgorithmscaneasilybeconstructedusingMTD. In the following, +∞and ∞ areusedastheupperandlower boundson therangeof leaf values.In actualimplementations,theseboundsaresuitablylarge®nitenumbers.

• SSS*[MTD+∞]MTD+∞ is justSSS*andcanbedescribedas:

MTD(+∞, bound:= g).

• DUAL* [MTD ∞]Theliteraturedescribesadualversionof SSS*,whereminimizationis replacedby maximization,theOPENlist is kept in reversedorder, andthestartvalueis

∞ [9, 19]. Thisalgorithmbecomes:

MTD( ∞,bound:= g + 1).

Theadvantageof DUAL* overSSS*liesin thesearchof odd-depthsearchtrees.Thetreesthatde®nesucha boundarecalledsolutiontrees[16, 25]. An upperboundis de®nedby a maxsolutiontree,a lower boundby a min solutiontree.In a maxsolutiontreeall childrenof maxnodesareincorporated,andexactlyonechild of amin node.Dually for amin solutiontree.

SinceSSS*buildsandre®nesa maxsolutiontreeof sizeO(w d/2 ) on uniformtrees,DUAL* buildsandre®nesmin solutiontreesof sizeO(w d/2 ) [19]. Sincethemin solutiontreesaresmaller, fewerleafnodesareexpandedandlessstorageis required. If, for example,w = 40 andd = 9 thenit makesa big differencewhetheronehasto searchandstorea treewith 404 or 405 leaves.

(Alternatively, for reasonsof symmetry, we could replacethe seventhline of®gure8 by g := MT(n, bound+ ε). In this way DUAL* canbe expressedasMTD( ∞,bound:= g), which is morelike thedriver for SSS*.

• MTD-biSinceMT can be usedto searchfrom above(SSS*) as well as from below

8No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 13: MTD-f

(DUAL*), anobvioustry is to bisecttheintervalandstartin themiddle. Sinceeachpassproducesanupperor lower bound,we cantakesomepivot valueinbetweenasthenextcenterfor oursearch.In MTD terms,bisectingtherangeofinterestbecomes:

MTD(avg(+∞, ∞), avg(ƒ+, ƒ ))

whereavgdenotestheaverageof two values.To reducebig swingsin thepivotvalue,somekind of aspirationsearchingwill bebene®cialin manyapplicationdomains[22].

Weill introducedasimilaralgorithmwhichhenamedNegaC*[26]. Hedoesnotstatethelink with best-®rstSSS*-likebehavior, nor thefact thatthis algorithm,providedthereis enoughstorage,will neverexpandmoreleafnodesthanAlpha-Beta.

• MTD-fRatherthanarbitrarily usingthe mid-point asan approximation,any informa-tion on wherethe valueis likely to lie canbe usedasa betterapproximation.Giventhatiterativedeepeningis usedin manyapplicationdomains,theobviousapproximationfor theminimaxvalueis theresultof thepreviousiteration. InMTD termsthisalgorithmbecomes:

MTD(approximation,if g < boundthen bound:= g elsebound:= g + 1).

MTD-f can be viewed as startingcloseto ƒ, and then doing either SSS* orDUAL*, skippinga largepartof theirsearchpath.

• MTD-stepInsteadof makingtiny jumpsfrom oneboundto the next, as in all the abovealgorithmsexceptMTD-bi, wecouldmakebiggerjumps:

MTD(+∞,bound:= max(ƒroot + 1, g stepsize))

(or thedualversion)wherestepsizeis somesuitablylargevalue.

• MTD-bestIf wearenot interestedin thegamevalueitself, butonly in thebestmove,thenastopcriterionsuggestedby Berlinercanbeused[1]. Wheneverthelowerboundof onemoveis not lower thantheupperboundsof all othermoves,it is certainthat this mustbe the bestmove. To provethis, we haveto do lesswork thanwhenwesolvefor ƒ, sincenoupperboundon thevalueof thebestmovehastobecomputed.We canuseeithera prove-beststrategy(establisha lower boundononemoveandthentry to createanupperboundontheothers)ordisprove-rest(establishanupperboundon all movesthoughtto be inferior andtry to ®ndalower boundon the remainingmove). The stopcriterion in ®gure8 mustbechangedto ƒbestmove≥ ƒ+

othermoves. Notethatthis strategyhasthepotentialto buildsearchtreessmallerthantheminimalsearchtree.

9No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 14: MTD-f

Dual*

ƒ

ƒ

ƒ

SSS*

+∞

ƒ+

ƒ+

ƒ

MTD-step

+∞

ƒ+

ƒ

ƒ

+∞

∞ƒ

ƒ+

ƒ

ƒ+

ƒ+

ƒ

ƒ

MTD-bi MTD-f

+∞

∞ƒ

ƒ+

h

ƒ

ƒ

Figure9: MT-basedalgorithms.

Theknowledgewhichmoveshouldberegardedasbest,andasuitablevaluefor a®rstguess,mightbeobtainedfrom apreviousiterationin aniterativedeepeningscheme.Thenotionof whichmoveis bestmightchangeduringthesearch.Thismakesfor aslightly morecomplicatedimplementation.

Notethatwhile all theabovealgorithmsusea transpositiontable,not all of themneedto savebothƒ+ andƒ values.Forexample,sinceSSS*alwaystriesto lowertheboundat theroot, thetranspositiontableneedsto storeonly onebound.

Figure9 illustratesthedifferentstrategiesusedby theabovealgorithmsfor con-verging on theminimaxvalue.

TheMTD frameworkhasa numberof importantadvantagesfor reasoningaboutgame-treesearchalgorithms.

3.1 EleganceFormulatingaseeminglydiversecollectionof algorithmsinto oneunifying frameworkfocusesattentionon thefundamentaldifferences.Forexample,theframeworkallowsthereaderto seejust how similar SSS*andDUAL* really are,andthatthesearejustspecialcasesof calling Test. Thedriversconciselycapturethealgorithmdifferences.MTD offers us a high-levelparadigmthat facilitatesthe reasoningaboutimportantissueslike algorithm ef®ciencyand memoryusage,without the needfor low-leveldetails.

3.2 Ef®ciency

All thealgorithmspresentedarebasedonMT. SinceMT isequivalenttoanull-windowAlpha-Betacall (plusstorage),theysearchlessnodesthantheinferiorone-passAlpha-Beta( ∞, +∞) algorithm(seealso[17]).

3.3 MemoryUsageAn importantissueconcerningtheef®ciencyof MT-basedalgorithmsismemoryusage.SSS*canbe regardedasmanipulatingonemax solutiontreein place[17]. A max

10No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 15: MTD-f

solutiontreehasonly onesuccessorateachmin nodeandall successorsatmaxnodes,while theconverseis truefor minsolutiontrees.Whenevertheupperboundis lowered,anew(better)subtreehasbeenexpanded.Thesubtreerootedin theleft brotherof thenewsubtreecanbe deletedfrom memory, sinceit is inferior [9, 21]. Sincethe newbrotheris better, thepreviousbrothernodeswill nevercontainthecritical pathagainandwill neverbe explored(other thanas transpositions).For a max solution tree,this implies that thereis alwaysonly one live child at a min node;bettermovesjustreplaceit. Givena branchingfactorof w anda treeof depthd, thespacecomplexityof a driver causingMT to constructandre®neonemax solutiontreeis thereforeoftheorderO(w d/2 ), andadrivermanipulatingonemin solutiontreeis orderO(w d/2 ).Someof thealgorithms,suchasMTD-bi, usetwo bounds.Theserequirea maxanda min solutiontree, implying that the storagerequirementsareO(w d/2 ). A simplecalculationandempiricalevidenceshowthis to berealisticstoragerequirements.

A transpositiontableprovidesa¯exiblewayof storingsolutiontrees.While atanytimeentriesfrom old (inferior) solutiontreesmayberesident,theywill beoverwrittenby newerentrieswhentheirspaceis needed.Garbagewill collectedincrementally. Aslongasthetableis big enoughto storethemin or amaxsolutiontreesthatareessentialfor the ef®cientoperationof the algorithm, it providesfor fast accessand ef®cientstorage.(Not to mentionthemoveorderingandtranspositionbene®ts.)

3.4 StartingValuePerhapsthe biggestdifferencein theMTD algorithmsis their ®rstapproximationoftheminimaxvalue:SSS*is optimistic,DUAL* is pessimisticandMTD-f is realistic.Onaverage,thereis arelationshipbetweenthestartingboundandthesizeof thesearchtreesgenerated.Wehaveconductedteststhatshowthatasequenceof MT searchesto®ndthegamevaluebene®tsfrom astartvaluecloseto thegamevalue.Valueslike +∞or ∞ asin SSS*andin DUAL* arein asensetheworstpossiblechoices.

By doingminimalwindowsearches,wehopetoestablishtheleft-mostminsolutiontreeandtheleft-mostmaxsolutiontreeasef®cientlyaspossible.Doingsearcheswith adifferentwindowcancausenodesto beexpandedthatareirrelevantfor thisproof. Wehavebeenunableto formulatethisanalytically, andinsteadpresentempiricalevidenceto supportthisconjecture.

Figure10 validatesthe choiceof a startingboundcloseto the gamevalue. The®gureshowsthepercentageof uniqueleaf evaluationsof iterativedeepeningMTD-f.Thedatapointsaregivenasapercentageof thesizeof thesearchtreebuilt by ourbestAlpha-Beta-searcher(AspirationNegaScout).(Sinceiterativedeepeningalgorithmsareused,thecumulativeleaf countoverall previousdepthsis shownfor thedepths.)Givenan initial guessof h andtheminimaxvalueof ƒ, theoptimalsearchvalueforMTD-f, thegraphplotsthesearcheffort expendedfor differentvaluesof h ƒ. To theleft of thegraph,MTD-f is closerto DUAL*, to theright it is closerto SSS*. To bebetterthanAspirationNegaScout,analgorithmmustbelessthanthe100%baseline.A®rstguessclosetoƒmakesMTD-f performbetterthanthe100%AspirationNegaScoutbaseline.Theguessmustbecloseto ƒ for theeffect to becomesigni®cant.Thus,ifMTD-f is to be effective, the ƒ obtainedfrom the previousiterationmustbe a goodindicatorof thenextiteration's value.

11No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 16: MTD-f

88

90

92

94

96

98

100

102

104

-40 -20 0 20 40

Cum

ulat

ive

Leav

es R

elat

ive

to A

sp N

S (

%)

Difference of first guess from f

Checkers, Average over 20 Treesdepth 13depth 15

94

96

98

100

102

104

106

108

110

-30 -20 -10 0 10 20 30

Cum

ulat

ive

Leav

es R

elat

ive

to A

sp N

S (

%)

Difference of first guess from f

Othello, Average over 20 Treesdepth 8depth 9

84

86

88

90

92

94

96

98

100

-40 -20 0 20 40

Cum

ulat

ive

Leav

es R

elat

ive

to A

sp N

S (

%)

Difference of first guess from f

Chess, Average over 20 Treesdepth 6depth 7

Figure10: Treesizerelativeto the®rstguessƒ.

12No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 17: MTD-f

4 Experiments

Therearethreewaysto evaluatea new algorithm: analysis,simulationor empiricaltesting. The emphasisin the literaturehasbeenon analysisandsimulation. This issurprisinggiventhelargenumberof game-playingprogramsin existence.

4.1 PreviousWorkThemathematicalanalysisof minimaxsearchalgorithmsdo a goodjob of increasingourunderstandingof thealgorithms,butfail togivereliablepredictionsof performance(for example,[6, 15, 16]). Theproblemis thatthegametreesthatareanalyzeddifferfrom thetreesgeneratedby realgame-playingprograms.

To overcomethisde®ciency, anumberof authorshaveconductedsimulations(forexample,[7, 9, 12, 19]). In our opinion,thesimulationsdid not capturethebehaviorof realisticsearchalgorithmsastheyareusedin game-playingprograms.Instead,wedecidedtoconductexperimentsin asettingthatwastobeasrealisticaspossible.Theseexperimentsattemptto addresstheconcernswehavewith thesimulationparameters:

• Variablebranchingfactor: simulationsusea®xedbranchingfactor.

• High degreeof ordering:mostsimulationshavethequalityof theirmoveorder-ing belowwhatis seenin realgame-playingprograms.

• Iterativedeepening:simulationsuse®xed-depthsearching.Game-playingpro-gramsuseiterativedeepeningto seedmemory(transpositiontable)with bestmovesto improvethemoveordering.This addsoverheadto thesearch,whichis morethanoffsetby theimprovedmoveordering.

Iterativedeepeningimposesa dynamicorderingon the tree. This meansthatdifferentalgorithmswill searcha treedifferently. Thus,proofsthatone®xed-depthalgorithmsearcheslessleavesthananother®xed-depthalgorithm(suchas[25]), donothold for iteratively-deepened searches.

• Memory: simulationsassumeeitherno storageof previouslycomputedresults,or unfairly bias their experimentsby not giving all the algorithmsthe samestorage. For iterativedeepeningto be effective, bestmove informationfrompreviousiterationsmust be savedin memory. In game-playingprogramsatranspositiontable is used. For bestperformance,this table may haveto beof considerablesize,sincethe moveorderinginformationof the nodesof theprevioussearchtree mustbe saved(which is exponentialin the depthof thesearch).

• Treesize:simulationsoftenuseaninconsistentstandardfor countingleafnodes.In conventionalsimulations(for example,[9]) eachvisit toaleafnodeiscountedfor depth-®rstalgorithmslike NegaScout,whereastheleaf is countedonly oncefor best-®rstalgorithmslike SSS* (becauseit was storedin memory, no re-expansionoccurs). This problemis a resultof thememoryproblemdescribedabove.

• Executiontime: simulationsareconcernedwith treesize,but practitionersareconcernedwith executiontime. Simulationsresultsdonotnecessarilycorrelate

13No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 18: MTD-f

well with executiontime. For example,therearemanypapersshowingSSS*expandsfewer leaf nodesthanAlpha-Beta. However, SSS* implementationsusingStockman's original algorithmhavetoo muchexecutionoverheadto becompetitivewith Alpha-Beta.

• Valuedependence:somesimulationsgeneratethevalueof a child independentof thevalueof theparent.However, thereis usuallya highcorrelationbetweenthevaluesof thesetwo nodesin realgames.

4.2 ExperimentDesignTo assessthe feasibility of the proposedalgorithms,a seriesof experimentswasperformedto compareAlpha-Beta,NegaScout,SSS*(MTD+∞), DUAL* (MTD ∞),MTD-f, andMTD-best.

Ratherthanusesimulations,our datahasbeengatheredfrom threegame-playingprograms:Chinook(checkers)[23], Keyano(Othello)andPhoenix(chess)[21]. Allthreeprogramsarewell-knownin their respectivedomain. For our experiments,weusedtheoriginalauthor's searchalgorithmwhich,presumably, hasbeenhighly tunedto the application. The only changewe madewasto disablesearchextensionsand,in the caseof ChinookandPhoenix,forward pruning. All programsusediterativedeepening.TheMTD algorithmswouldberepeatedlycalledwith successivelydeepersearchdepths.All threeprogramsusedastandardtranspositiontablewith amaximumof 220 entries. Testingshowedthat the solutiontreescould comfortably®t in tablesof this size. For our experimentswe usedtheoriginal programauthor's transpositiontable data structuresand code without modi®cation. 1 At an interior node, themove suggestedby the transpositiontable is alwayssearched®rst(if known), andthe remainingmovesareorderedbeforebeingsearched.ChinookandPhoenixusedynamicorderingbasedon thehistoryheuristic[22], while Keyanousesstaticmoveordering.

All threeprogramsusetranspositiontableswith only onebound(in contrastwithourcodein ®gure1). SinceMTD-bi andMTD-stepmanipulatebothamaxanda minsolutiontree,andthereforeneedto storebothanupperanda lower boundat a node,we do not presentthesetwo algorithms. Oneof the pointswe arestressingin thispaperis easeof implementationof theMT-framework.Theamountof work involvedin alteringtheinformationthatis storedin thetranspositiontableswouldcompromisethisobjective.

The MT codegiven in ®gure1 hasbeenexpandedto include two details,bothof which arecommonpracticein game-playingprograms.The®rstis a searchdepthparameter. Thisparameteris initializedto thedepthof thesearchtree.AsMT descendsthesearchtree,thedepthis decremented.Leaf nodesareat depthzero. Thesecondis thesavingof thebestmoveat eachnode.Whena nodeis revisited,thebestmovefrom theprevioussearchis alwaysconsidered®rst.

Conventionaltestsetsin theliteratureprovedto bepoorpredictorsof performance.Positionsin testsetsareusuallyselectedto testa particularcharacteristicor propertyof thegame(suchastacticalcombinationsin chess)andarenotnecessarilyindicative

1As a matterof fact, sincewe implementedMT usingnull-window alpha-betasearches,we did nothaveto makeanychangesat all to thecodeotherthantheaforementioneddisablingof forwardpruningandsearchextensions.Weonly hadto introducetheMTD drivercode.

14No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 19: MTD-f

of typical gameconditions. For our experiments,the programsweretestedusingasetof 20 positionsthatcorrespondedto movesequencesfrom tournamentgames.Byselectingmovesequencesratherthanisolatedpositions,we areattemptingto createatestsetthat is representativeof realgamesearchproperties(includingpositionswithobviousmoves,hardmoves,positionalmoves,tacticalmoves,differentgamephases,etc.).

All threeprogramswere run on 20 test positions,searchingto a depthso thatall searchedroughly the sameamountof time. Becauseof the low branchingfactorChinookwasableto searchto depth15, iteratingtwo ply at a time. Keyanosearchedto 9 ply andPhoenixto 7, bothoneply ata time.

4.3 BaseLineMany papersin the literatureuseAlpha-Betaasthebase-linefor comparingtheper-formanceof otheralgorithms(for example,[2, 10]). Theimplicationis thatthis is thestandarddatapointwhicheveryoneis trying tobeat.However, game-playingprogramshaveevolvedbeyondsimpleAlpha-Betaalgorithms.Most useAlpha-Betaenhancedwith minimal window search(usuallyPVS[2] or NegaScout[18]), iterativedeepen-ing, transpositiontables,moveorderingandan initial aspirationwindow. Sincethisis thetypical searchalgorithmusedin high-performanceprograms(suchasChinook,PhoenixandKeyano),it seemsmorereasonableto usethisasourbase-linestandard.

Theworsethebase-linecomparisonalgorithmchosen,thebetterotheralgorithmsappearto be. By choosingNegaScoutenhancedwith aspirationsearching(AspirationNegaScout)asour performancemetric, we areemphasizingthat it is possibleto dobetter than the "best" methodscurrently practicedand that, contrary to publishedsimulationresults,somemethodsareinferior.

BecauseweimplementedtheMTD algorithms(like SSS*andDUAL*) usingMT(equivalentto null-window Alpha-Betacallswith a transpositiontable)we wereableto comparea numberof algorithmsthat werepreviouslyseenasvery different. Byusing MT as a commonproof-procedure,every algorithmbene®tedfrom the sameenhancementsconcerningiterativedeepening,transpositiontables,andmoveorderingcode.To ourknowledgethis is the®rstcomparisonof algorithmsdepth-®rstandbest-®rstminimaxsearchalgorithmswhereall thealgorithmsaregivenidenticalresources.

4.4 ExperimentResultsFigure11 showstheperformanceof Chinook,KeyanoandPhoenix,respectively,

using the numberof leaf evaluations(NBP or Numberof Bottom Positions)as theperformancemetric. Figures12 showtheperformanceof thesealgorithmsusingthenumberof nodesin thesearchtree(interior andleaf) asthemetric. Thegraphsshowthecumulativenumberof nodesoverall previousiterationsfor acertaindepth(whichis realisticsinceiterativedeepeningis used)relativeto AspirationNegaScout.SincewecouldnottestMTD-bi andMTD-stepproperlywithoutmodifyingthetranspositiontablecode,theseresultsarenotshown.Thesearchdepthsreachedbytheprogramsvarygreatlybecauseof thediffering branchingfactors.In checkers,theaveragebranchingfactor is approximately3 (therearetypically 1.2 movesin a captureposition;whileroughly8 in anon-captureposition),in Othelloit is 10andin chessit is 36.

15No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 20: MTD-f

85

90

95

100

105

110

115

2 4 6 8 10 12 14 16Depth

Chinook, Average over 20 Trees

AspNSAB

MTD-fDual*SSS*

85

90

95

100

105

110

115

2 3 4 5 6 7 8 9 10Depth

Keyano, Average over 20 Trees

AspNSAB

MTD-fDual*SSS*

85

90

95

100

105

110

115

2 3 4 5 6 7 8Depth

Chess - Leaves Relative to Aspiration NegaScout (%)

AspNSAB

MTD(f)Dual*SSS*

Figure11: Leafnodecount

16No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 21: MTD-f

90

100

110

120

130

140

150

2 4 6 8 10 12 14 16Depth

Chinook, Average over 20 Trees

AspNSAB

MTD-fDual*SSS*

90

100

110

120

130

140

150

2 3 4 5 6 7 8 9 10Depth

Othello - Total nodes Relative to Aspiration NegaScout (%)

AspNSAB

MTD(f)Dual*SSS*

100

120

140

160

180

200

2 3 4 5 6 7 8Depth

Chess, All nodes Relative to Aspiration NegaScout

AspNSAB

MTD(f)Dual*SSS*

Figure12: Totalnodecount

17No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 22: MTD-f

Over all threegames,the bestresultsarefrom MTD-f. Its leaf nodecountsareconsistentlybetterthanAspirationNegaScout,averagingat leasta 5% improvement.More surprisinglyis thatMTD-f outperformsAspirationNegaScouton thetotal nodemeasureaswell. Sinceeachiterationrequiresrepeatedcalls to MT (at leasttwo andpossiblymany more), one might expectMTD-f to perform badly by this measurebecauseof therepeatedtraversalsof thetree. This suggeststhatMTD-f, on average,is callingMT closeto theminimumnumberof times.Forall threeprograms,MT getscalledbetween3 and4 timesonaverage.In contrast,theSSS*andDUAL* resultsarepoorcomparedto NegaScoutwhenall nodesin thesearchtreeareconsidered.Eachof thesealgorithmsperformsdozensandsometimesevenhundredsof MT searches.

The successof MTD-f implies that it is better to start the searchwith a goodguessasto wheretheminimaxvaluelies, ratherthanstartingat oneof theextremes(+∞ or ∞). Clearly, MTD-f out-performsAlpha-Beta,suggestingthatbinary-valuedsearchesare more ef®cientthan wide-windowedsearches.Of more interestis theMTD-f andNegaScoutcomparison.If bothsearcheshavethecorrectbestmovefromthepreviousiteration,thenboth®ndthevalueof thatmoveandthenbuild proof treesto showtheremainingmovesareinferior. SinceNegaScoutdoesthis with a minimalwindow, it visits exactlythesameleaf nodesthat MT would. Sinceour results(andthosefrom simulations)showNegaScoutto besuperiorto Alpha-Beta,this suggeststhat limiting the useof wide-window searchesis bene®cial. MT takesthis to theextremeby eliminatingall wide-windowsearches.

What is theminimum amountof searcheffort that mustbe doneto establishtheminimaxvalueof thesearchtree? If we know thevalueof thesearchtreeis ƒ, thentwo searchesarerequired:MTD-f(ƒ ε), which fails highestablishinga lowerboundon ƒ, andMTD-f(ƒ + 1 ε), which fails low andestablishesanupperboundon ƒ. Ofthe algorithmspresented,MTD-f hasthe greatestlikelihood of doing this minimumamountof work. Theclosertheapproximationtoƒ, thelesstheworkthathastobedone(seealso®gure10). Consideringthis, it is not a surprisethatbothDUAL* andSSS*comeour poorly. Their initial boundsfor theminimaxvaluearegenerallypoor( ∞and+∞ respectively),meaningthat themanycalls to MT resultin signi®cantlymoreinterior nodes. In the literature,SSS*is regardedasgoodbecauseit expandsfewerleaf nodesthan Alpha-Beta,and bad becauseof the algorithm complexity, storagerequirementsandstoragemaintenanceoverhead.Our resultsgive theexactoppositeview: SSS*is easyto implementbecauseof theMT framework,usesasmuchstorageasAspirationNegaScout,andperformsgenerallyworsethanAspirationNegaScoutwhenviewedin thecontextof iterativedeepeningandtranspositiontables.DUAL* ismoreef®cientthanSSS*but still comesout poorly in all thegraphsmeasuringtotalnodecount.TheChinookleafcountgraphis instructive,sincehereiterativedeepeningSSS*isgenerallyworsethaniterativedeepeningAlpha-Beta.Thisresultrunsoppositeto boththeproofandsimulationsfor ®xeddepthSSS*andAlpha-Beta.An interestingobservationis thattheeffectivenessof SSS*appearsto beafunctionof thebranchingfactor;thelargerthebranchingfactor, thebetterit performs.

AspirationNegaScoutis betterthanAlpha-Beta.Thisresultis consistentwith [22]whichshowedaspirationNegaScoutto beasmallimprovementoverAlpha-Betawhentranspositiontablesanditerativedeepeningwereused.NegaScoutusesawide-windowsearchfor theprincipalvariation(PV)andall re-searches.Thewide-windowPVsearchresultgivesa good®rstapproximationto theminimaxvalue. Thatapproximationis

18No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 23: MTD-f

then usedto searchthe rest of the tree with minimal window searches,which areequivalentto MT calls. If theserefutationsearchesaresuccessful(no re-searchisneeded),thenNegaScoutdeviatesfrom MTD-f only in thewayit searchesthePV for avalue.MTD-f usesMT alsofor searchingthePV, andfor re-searchesaftera fail high.SinceMTD-f doesnotderiveits ®rstguessfrom thetreelike NegaScoutdoes,it mustget it externally. In our experimentsthe minimax valuefrom the previousiterativedeepeningiterationwasusedfor thispurpose.

Simulationresultsshowthatfor ®xeddepthsearches,without transpositiontablesanditerativedeepening,SSS*,DUAL* andNegaScoutaremajorimprovementsoversimple Alpha-Beta[7, 9, 19]. For example,one study showsSSS* and DUAL*building treesthat are half the size of thosebuilt by Alpha-Beta[9]. This is insharpcontrastto the resultsreportedhere. Why is theresucha disparity with thepreviouslypublishedwork? Includingtranspositiontablesanditerativedeepeningintheexperimentimprovesthesearchef®ciencyin two ways:

• improvemoveorderingsothatthelikelihoodof thebestmovebeingconsidered®rstat acutoff nodeis veryhigh,and

• eliminatelarge portionsof thesearchby havingthesearchtreebe treatedasasearchgraph.A paththat transposesinto a pathalreadysearchedcanreusethepreviouslycomputedresult.

Themoveorderingis improvedto theextentthatall algorithmsareconverging on theminimalsearchtree.

Howeffectiveis themoveordering?At anodewith acutoff, onlyonemoveshouldbeexamined.Datagatheredfrom Phoenix,Keyano,andChinookshowanaverageofaround1.2,1.2 and1.1 move,respectively, areconsideredat cut nodes.On average,over95%of the time themoveorderingis successful.Clearly, the low ply numbersshouldhaveverygoodorderingbecauseof theiterativedeepening(typically 97-98%).Of more interestare these®guresfor the higher ply numbers. Thesenodesoccurnearthebottomof the treeandusuallydo not havethebene®tof transpositiontableinformationto guidethe selectionof the ®rstmoveto consider. Nevertheless,eventhesenodeshaveasuccessratein excessof 90%(80%for Keyano,whichdoesnothavethe history heuristic). CampbellandMarslandde®nestrongly ordered tressto havethebestmove®rst80%of thetime[2]. Kaindl etal. called90%verystronglyordered[7]. Our resultssuggestthereshouldbe a new category, almostperfectlyordered,correspondingto over95%successof the®rstmoveat cutoff nodes.Given that thesimulationsusedpoorermoveorderingthanis seenin practice,it is notsurprisingthattheir resultshavelittle relationto ours.

4.5 ExecutionTimeThebottomline for practitionersisexecutiontime. Sincewedidnothavetheresourcesto run all our experimentson identicalandotherwiseidle machines,we do not showexecutiontime graphs.However, comparingresultsfor thesamemachineswe foundthatMTD-f is consistentlythefastestalgorithm. It is about5% fasterthanAspirationNegaScoutfor checkersandOthello,andabout10%fasterfor chess,(dependingin partonthequalityof thescoreof thepreviousiteration).Wefoundthatfor Phoenixprogram

19No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 24: MTD-f

executiontime is stronglycorrelatedto the leaf-nodemetric,while for ChinookandKeyanoit is stronglycorrelatedto thetotal-nodemetric.

4.6 MTD-bestThe algorithmsdiscussedso far aredesignedto ®ndthe minimax valueof the tree.MTD-bestis only interestedin thebestmoveandnot thebestvalue. Thusit hasthepotentialto build evensmaller trees. This algorithmusesthe valueof the previousiteration to try to establishonly a lower boundon the bestmove,and then (if thiswassuccessful)anupperboundon theothermoves.MTD-bestprovesthesuperiorityof thebestmoveby only constructinga lower bound. This algorithmsaveson nodeexpansionsbynotconstructinganupperboundonthevalueof thebestmove.Figure13showsthatMTD-bestcanbeslightly betterthanMTD-f. This resultmustbetakeninthe contextthat MTD-bestis trying to solvea simplerproblem: ®ndthe bestmove,not thebestvalue. If thechoiceof bestmoveis thesameasin thepreviousiteration,MTD-bestiseffectiveatprovingthatmoveisbest.Whenthecandidatebestmoveturnsoutnotto bebest,MTD-bestis inef®cientsinceit endsupdoingmorework becauseitsinitial assumption(thebestmove)is wrong. Thuson a givensearch,MTD-bestmaybemuchbetteror muchworsethanMTD-f. In thetestdata,MTD-bestwaseffectiveatprovingthebestmoveabouthalf of thetime. Sincethealgorithmis oftenperformingafew percentbetterthanMTD-f, thepotentialgainsof provingamovebestout-weighthepotentiallosses.

Our implementationof MTD-bestis just a ®rstattempt. We believethat thereisstill roomfor improvingthealgorithm.

5 Conclusions

Overthirty yearsof researchhavebeendevotedto improvingtheef®ciencyof alpha-betasearching. The MT family of algorithmsare comparativelynew, without thebene®tof intenseinvestigations.GiventhatMTD-f is alreadyout-performingourbestalpha-betabasedimplementationsin realgame-playingprograms,futureresearchcanonly makevariationson this algorithmmoreattractive. MT is a simpleandelegantparadigmfor highperformancegame-treesearchalgorithms.

Thepurposeof asimulationis to exactlymodelanalgorithmto gaininsightinto itsperformance.Simulationsareusuallyperformedwhenit istoodif®cult or tooexpensiveto constructtheproperexperimentalenvironment.Thesimulationparametersshouldbechosento comeascloseto exactlymodelingthedesiredscenarioasis possible.Inthecaseof game-treesearching,thecasefor simulationsis weak.Thereis no needtodo simulationswhentherearequality game-playingprogramsavailablefor obtainingactualdata. Further, as this paperhasdemonstrated,simulationparameterscanbeincorrect,resultingin largeerrorsin theresultsthatleadto misleadingconclusions.Inparticular, the failure to includeiterativedeepening,transpositiontables,andalmostperfectlyordered treesin asimulationareseriousomissions.

With easyto understandMT-basedalgorithmsout-performingNegaScout,thisleadsto theobviousquestion:Why areyoustill usingalpha-beta?

20No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 25: MTD-f

85

90

95

100

105

110

115

2 4 6 8 10 12 14 16Depth

Chinook, Average over 20 TreesAspNS

ABMTD-f

MTD-best

85

90

95

100

105

110

115

2 3 4 5 6 7 8 9 10Depth

Keyano, Average over 20 Trees

AspNSAB

MTD-fMTD-best

85

90

95

100

105

110

115

2 3 4 5 6 7 8Depth

Phoenix, Average over 20 TreesAspNS

ABMTD-f

MTD-best

Figure13: LeafnodecountMTD-best

21No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 26: MTD-f

Acknowledgements

Thiswork hasbene®tedfrom discussionswith Mark Brockington(authorof Keyano),Yngvi BjornssonandAndreasJunghanns.The ®nancialsupportof the NetherlandsOrganizationfor Scienti®cResearch(NWO), the NaturalSciencesandEngineeringResearchCouncilof Canada(grantOGP-5183)andtheUniversityof AlbertaCentralResearchFundaregratefullyacknowledged.

References

[1] HansJ. Berliner. The B* treesearchalgorithm: A best-®rstproof procedure.Arti®cialIntelligence, 12:23±40,1979.

[2] Murray S.CampbellandT. A. Marsland.A comparisonof minimaxtreesearchalgorithms.Arti®cialIntelligence, 20:347±367,1983.

[3] Arie de Bruin, Wim Pijls, andAske Plaat. Solutiontreesasa basisfor gametree search. TechnicalReportEUR-CS-94-04,Departmentof ComputerSci-ence,ErasmusUniversityRotterdam,P.O. Box 1738,3000DR Rotterdam,TheNetherlands,May 1994.

[4] CarlEbeling.All theRightMoves. MIT Press,Cambridge,Massachusetts,1987.

[5] RainerFeldmann.Spielbaumsuchemit massivparallelenSystemen. PhDthesis,UniversitÈat-Gesamthochschule Paderborn,May 1993.

[6] ToshihideIbaraki. Generalizationof alpha-betaand SSS* searchprocedures.Arti®cialIntelligence, 29:73±117,1986.

[7] HermannKaindl,RezaShams,andHelmutHoracek.Minimaxsearchalgorithmswith andwithout aspirationwindows. IEEE Transactionson PatternAnalysisandMachineIntelligence, PAMI-13(12):1225±1235,December1991.

[8] Donald E. Knuth and RonaldW. Moore. An analysisof alpha-betapruning.Arti®cialIntelligence, 6(4):293±326,1975.

[9] T. A. Marsland,AlexanderReinefeld,andJonathanSchaeffer. Low overheadalternativesto SSS*.Arti®cialIntelligence, 31:185±199,1987.

[10] T. AnthonyMarsland.A reviewof game-treepruning.ICCAJournal, 9(1):3±19,March1986.

[11] T. AnthonyMarslandandAlexanderReinefeld.Heuristicsearchin oneandtwoplayer games. Technicalreport, University of Alberta, PaderbornCenterforParallelComputing,February1993.Submittedfor publication.

[12] Agata Muszyckaand Rajjan Shinghal. An empirical comparisonof pruningstrategiesin gametrees. IEEE Transactionson Systems,Man andCybernetics,15(3):389±399,May/June1985.

[13] JudeaPearl. Asymptotical propertiesof minimax treesand gamesearchingprocedures.Arti®cialIntelligence, 14(2):113±138,1980.

22No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html

Page 27: MTD-f

[14] JudeaPearl. Thesolutionfor thebranchingfactorof thealpha-betapruningal-gorithmandits optimality. Communicationsof theACM, 25(8):559±564,August1982.

[15] JudeaPearl. Heuristics± Intelligent Search Strategiesfor ComputerProblemSolving. Addison-WesleyPublishingCo.,Reading,MA, 1984.

[16] Wim Pijls andArie deBruin. Searchinginformedgametrees.TechnicalReportEUR-CS-92-02,ErasmusUniversityRotterdam,Rotterdam,NL, October1992.Extendedabstractin ProceedingsCSN 92, pp. 246±256,and Algorithms andComputation,ISAAC 92(T. Ibaraki,ed),pp.332±341,LNCS650.

[17] AskePlaat,JonathanSchaeffer, Wim Pijls, andArie deBruin. Popularmiscon-ceptionsaboutSSS*in practice.TechnicalReportTR-CS-94-17,DepartmentofComputingScience,Universityof Alberta,Edmonton,AB, Canada,December1994.

[18] AlexanderReinefeld.An improvementof theScouttree-searchalgorithm.ICCAJournal, 6(4):4±14,1983.

[19] Alexander Reinefeld. Spielbaum Suchverfahren. volume Informatik-Fachberichte200.SpringerVerlag,1989.

[20] Igor RoizenandJudeaPearl.A minimaxalgorithmbetterthanalpha-beta?Yesandno. Arti®cialIntelligence, 21:199±230,1983.

[21] JonathanSchaeffer. Experimentsin Search andKnowledge. PhDthesis,Depart-mentof ComputingScience,Universityof Waterloo,Canada,1986.AvailableasUniversityof AlbertatechnicalreportTR86-12.

[22] JonathanSchaeffer. The history heuristicandalpha-betasearchenhancementsin practice. IEEE Transactionson PatternAnalysisand MachineIntelligence,PAMI-11(1):1203±1212,November1989.

[23] JonathanSchaeffer, JosephCulberson,NormanTreloar, BrentKnight, PaulLu,andDuaneSzafron.A world championshipcalibercheckersprogram.Arti®cialIntelligence, 53(2-3):273±290,1992.

[24] D.J.SlateandL.R.Atkin. Chess4.5- theNorthwesternUniversitychessprogram.In P.W. Frey, editor, ChessSkill in ManandMachine, pages82±118,NewYork,1977.Springer-Verlag.

[25] G. C. Stockman.A minimaxalgorithmbetterthanalpha-beta?Arti®cialIntelli-gence, 12(2):179±196,1979.

[26] Jean-ChristopheWeill. The NegaC*search. ICCA Journal, 15(1):3±7,March1992.

23No license: PDF produced by PStill (c) F. Siegert - http://www.this.net/~frank/pstill.html