IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES 1
Multi-scale Bayesian modeling for RTS games: an application to StarCraft AI
Gabriel Synnaeve (email@example.com) Pierre Bessiere (firstname.lastname@example.org)
Abstract—This paper showcases the use of Bayesian models for real-time strategy (RTS) game AI in three distinct core components: micro-management (unit control), tactics (army moves and positions), and strategy (economy, technology, production, army types). The strength of having end-to-end probabilistic models is that distributions on specific variables can be used to inter-connect different models at different levels of abstraction. We applied this modeling to StarCraft, and evaluated each model independently. Along the way, we produced and released a comprehensive dataset for RTS machine learning.
Index Terms—Bayesian modeling, RTS AI, real-time strategy, video games, StarCraft, tactics, micro-management
I. INTRODUCTION

Research on video games sits between research on real-world robotics and research on simulations or theoretical games. Indeed, artificial intelligences (AIs) evolve in a simulated world (with no sensor and actuator problems) that is also populated with human-controlled agents and/or other AI agents over which we often have no control. Thus, video games constitute a good middle ground for experimenting with robotics-inspired and cognitively-inspired techniques and models. Moreover, the gigantic complexity of RTS AI pushes researchers to try approaches different from those used for strategic board games (Chess, Go...).
We will first show how the complexity of game AI (and particularly RTS AI) is several orders of magnitude larger than that of board games. Abstractions and simplifications are therefore necessary to work on the complete problem. We will then explain how building abstractions with Bayesian modeling is one possible framework for dealing with the complexity of game AI, by handling uncertainty and abstraction efficiently. Then, we will successively present our three hierarchical abstraction levels of interconnected models: micro-management, tactical, and strategic Bayesian models. We will see how to do reactive unit control, and how to take objectives from a tactical model. Then we will show how to infer the opponent's tactics using knowledge of our strategic prediction. Finally, we will give a detailed analysis of an army composition model.
II. RTS AI PROBLEM
RTS is a sub-genre of strategy games where players need to build an economy (gathering resources and building a base) and military power (training units and researching
G. Synnaeve was (during this work) with the LSCP at ENS / EHESS /CNRS, Paris, France. He is now at Facebook AI Research, Paris, France.
P. Bessiere is with the CNRS/Sorbonne Univ./UPMC/ISIR, Paris, France.
technologies) in order to defeat their opponents (destroying their army and base). From a theoretical point of view, the main differences between RTS games and traditional board games are that RTS games have simultaneous moves (all players can move at the same time, with as many units as they want), durative actions (taking some time to complete), incomplete information (due to the fog-of-war: players can only see the dynamic state of the world/map where they have units), sometimes non-determinism (only slightly for StarCraft), and the players need to act in real time (24 game frames per second for StarCraft). As a metaphor, playing an RTS game is like playing simultaneous-moves Chess while playing the piano to move the pieces around the board. More information about StarCraft gameplay can be found in  and in pp. 59-69 of .
Traditional (non-video) game AI takes its roots in solving board strategy games. In those games, the complexity of the game can be captured by the computational complexity of the tree search in a min-max-like algorithm, which is defined by the branching factor b and the depth d of the tree: for instance, for Chess, b ≈ 35 and d ≈ 80. Table I gives an overview of such complexity (first column) for several games and game genres. For video games, we estimate the human difficulty from the player's choices and actions (except for RTS, for which we give both the strict computational complexity and the human difficulty): b is the number of possible actions each time the player takes an action, and d/min is the average number of (discrete, not counting mouse movements as continuous trajectories) actions per minute (APM). Table I also shows a qualitative analysis of the amount of partial information, randomness, hierarchical continuity (how much an abstract decision constrains the player's actions), and temporal continuity (how much previous actions constrain the next actions).
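To give a sense of scale, the tree-search complexity b^d can be compared across games; the following sketch (Python, purely illustrative, using the approximate b and d figures quoted in Table I) computes the order of magnitude of the search space:

```python
# Order-of-magnitude comparison of game-tree sizes b^d, using the
# approximate b and d figures quoted in Table I (illustration only).
from math import log10

def tree_size_log10(b: float, d: float) -> float:
    """log10(b^d) = d * log10(b): the number of digits of the tree size."""
    return d * log10(b)

chess = tree_size_log10(35, 80)    # Chess: b ~ 35, d ~ 80
go = tree_size_log10(300, 200)     # Go (upper figures): b ~ 300, d ~ 200
print(f"Chess search tree ~ 10^{chess:.0f} nodes")
print(f"Go search tree    ~ 10^{go:.0f} nodes")
```

Even with these coarse figures, the gap between genres is clear: each extra unit of depth multiplies the tree size by b, which is why exhaustive search is hopeless for RTS games.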
We apply two kinds of simplifications to this very complex problem of full-game real-time strategy AI. On the one hand, we simplify decisions by taking into account their sequentiality. We consider that a decision taken at a previous time t-1 (softly) prunes the search of potential actions at time t, for a given level of reasoning (given level of abstraction). This corresponds to making a Markovian hypothesis in probabilistic modeling. For instance, as shown (in red, left-to-right arrows) in Fig. 1, a tactical decision to attack from the front (F) is more likely followed by a hit-and-run (H) than by an attack from the back (B) or a sneaky infiltration (I). On the other hand, we decide on hierarchical levels of abstraction at which
TABLE I: Computational complexity of different game genres
Quantization, in increasing order: no, negligible, few, some, moderate, much

Game        | Combinatory                  | Partial information | Randomness | Hierarchical continuity | Temporal continuity
Chess       | b ≈ 35; d ≈ 80               | no                  | no         | some                    | few
Go          | b ≈ 30-300; d ≈ 150-200      | no                  | no         | some                    | moderate
Limit Poker | b ≈ 3; d/hour ≈ [20...240]   | much                | much       | moderate                | few
Time Racing | b ≈ 50-1,000; d/min ≈ 60+    | no                  | no         | much                    | much
Team FPS    | b ≈ 100-2,000; d/min ≈ 100   | some                | some       | some                    | moderate
FPS duel    | b ≈ 200-5,000; d/min ≈ 100   | some                | negligible | some                    | much
MMORPG      | b ≈ 50-100; d/min ≈ 60       | few                 | moderate   | moderate                | much
RTS         | d/min (=APM) ≈ 300           | much                | negligible | moderate                | some
  (human)          b ≈ 200; d ≈ 7,500
  (full complexity) b ≈ 30^60; d ≈ 36,000
Fig. 1: Sequential (horizontal, red) and hierarchical (vertical, blue) decision constraints. At the strategic level, A, D, C, H respectively stand for attack, defend, collect, hide; at the tactical level, F, B, H, I respectively stand for front, back, hit-and-run, infiltrate. The squares correspond to actionable (low-level) decisions, like moving a unit or making it attack a target.
we should take decisions that impact the level below, pruning the hierarchical decisions according to what is possible. For instance, as shown (in blue, top-down arrows) in Fig. 1, if our strategic decision distribution is more in favor of attacking (A) than of defending (D), collecting (C) or hiding (H), this constrains the subsequent tactics too. We will see that these levels of abstraction are easily recoverable from the rules/structure of the game.
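These two soft constraints can be combined as probability distributions over the same variable. As a toy sketch (the tactic names follow Fig. 1, but the probability values are invented for illustration), fusing the sequential constraint P(tactic_t | tactic_{t-1}) with the hierarchical constraint P(tactic_t | strategy_t) by a renormalized product gives:

```python
# Toy fusion of the two soft constraints of Fig. 1 (values are made up):
# sequential:   P(tactic_t | tactic_{t-1} = front) -- hit-and-run likeliest
# hierarchical: P(tactic_t | strategy_t = attack)  -- frontal assault favored
p_seq  = {"front": 0.3, "back": 0.1, "hit-and-run": 0.5, "infiltrate": 0.1}
p_hier = {"front": 0.4, "back": 0.3, "hit-and-run": 0.2, "infiltrate": 0.1}

def fuse(p1, p2):
    """Renormalized product of two distributions over the same variable."""
    z = sum(p1[t] * p2[t] for t in p1)
    return {t: p1[t] * p2[t] / z for t in p1}

posterior = fuse(p_seq, p_hier)  # distribution over tactic_t
```

With these (invented) numbers, the frontal attack comes out as the most probable tactic once both constraints are combined, illustrating how neither level of abstraction alone fully determines the decision.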
So, we decided to decompose RTS AI into the three levels which gamers use to describe the game: strategy, tactics, and micro-management. These levels are shown from left to right in the information-centric decomposition of our StarCraft bot in Fig. 2. Parts of the map not in the sight range of the player's units are under fog-of-war, so the player has only partial information about the enemy buildings, technologies and army (unit positions). The way in which we expand the tech tree, the specific units composing the army, and the general stance (aggressive or defensive) constitute what we call strategy (left part of Fig. 2). At the lower level (bottom right in Fig. 2), the actions performed by the player (human or not) to optimize the effectiveness of its units are called micro-management. In between lies tactics: where to
Fig. 2: Information-centric view of the architecture of our StarCraft bot's major components. Arrows are labeled with the information or orders they convey: dotted arrows convey constraints, double-lined arrows convey distributions, and plain simple arrows convey direct information or orders. The gray parts perform game actions (as the physical actions of the player on the keyboard and mouse).
attack, and how. A good human player takes much data into consideration when choosing: are there flaws in the defense? Which spot is most worth attacking? How vulnerable am I if I attack here? Is the terrain (height, chokes) to my advantage? The concept of strategy is a little more abstract: at the beginning of the game, it is closely tied to the build order and to the intention of the first few moves, and is called the opening, as in Chess. Then, the long-term strategy can be partially summed up by a few indicators: initiative (is the player leading or adapting?) and the distribution of resources between technology advancement, army production and economic growth.
C. Bayesian programming
Probability is used as an alternative to classical logic, and we transform incompleteness (in the experiences, observations, or the model) into uncertainty. We now present Bayesian programs (BP), a formalism that can be used to describe entirely any kind of Bayesian model, subsuming Bayesian networks and Bayesian maps, and equivalent to probabilistic factor graphs
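As a concrete miniature of this idea (the variables, values, and probabilities below are hypothetical, not taken from the paper's models): a Bayesian program specifies variables, a decomposition of their joint distribution, and the question asked of the model. Here the joint P(S, T, O) = P(S) P(T|S) P(O|T) relates a strategy S, a tactic T and an observation O, and the question P(T | O) is answered by exact enumeration:

```python
# Minimal sketch of the Bayesian-program idea: specify a joint decomposition
# P(S, T, O) = P(S) P(T|S) P(O|T), then answer a question P(T | O = o) by
# exact enumeration. All names and numbers are illustrative.
S_vals = ["aggressive", "defensive"]   # strategy
T_vals = ["front", "infiltrate"]       # tactic
O_vals = ["army_seen", "nothing_seen"] # observation

P_S = {"aggressive": 0.6, "defensive": 0.4}
P_T_given_S = {
    "aggressive": {"front": 0.8, "infiltrate": 0.2},
    "defensive":  {"front": 0.3, "infiltrate": 0.7},
}
P_O_given_T = {
    "front":      {"army_seen": 0.9, "nothing_seen": 0.1},
    "infiltrate": {"army_seen": 0.2, "nothing_seen": 0.8},
}

def joint(s, t, o):
    """The decomposition P(S, T, O) = P(S) P(T|S) P(O|T)."""
    return P_S[s] * P_T_given_S[s][t] * P_O_given_T[t][o]

def posterior_T(o):
    """Question P(T | O = o): sum out S, then renormalize."""
    unnorm = {t: sum(joint(s, t, o) for s in S_vals) for t in T_vals}
    z = sum(unnorm.values())
    return {t: p / z for t, p in unnorm.items()}

print(posterior_T("nothing_seen"))
```

Observing nothing pushes the posterior toward the stealthy tactic, exactly the kind of inference the question part of a Bayesian program formalizes.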