OpenAI Five Model Architecture (08/06/2018)

[Figure: network architecture diagram. The text labels recovered from it are grouped below.]

Players: Player 1, Player 2, Player 3, Player 4, Player 5

Inputs
• Nearby terrain: 8x8 grid of height, traversability, creep occupancy, for each hero in team
• Pickups (Pickup 1, …): Embedding of pickup type; distance from current player, present/missing → concat → FC-relu → FC-relu → max-pool
• 10 x 10 minimap, with image channels for vision, walkability, creep location, ward locations: convolution-relu (stride=1, filters=32, width=2) → max-pool-2d → convolution-relu (stride=1, filters=32, width=2)
• Player hero: Embedding
• Allied & enemy glyph cooldown
• is Night
• time until creepwave
• time since enemy courier last seen
• time until night
• courier number of flask, clarity, enchanted mangoes, town portals, magic sticks
• Total value courier items

Units (Unit 1, Unit 2, … Unit N)
• Unit type (Embedding)
• distance from current player
• orientation (cos, sin)
• absolute position
• animation (Embedding)
• health over last 12 frames
• unit stats (health, regen, attack, …)
• Current player is attacking / attacked by this unit
• Buyback cost, cooldown
• Number of deaths
• Ability is active, used, or phased
• Teleport destination, time, ongoing
• Team
• Level, Max Mana, Magic resist, Agility, Intelligence, etc.
• Annotation in figure: Heroes only

Per-unit sets, each: Embedding of type and stats → concat → FC-relu → FC-relu → max-pool
• Abilities (Ability 1, … Ability N): Ability type; Stats (cooldown, etc.)
• Items (Item 1, …): Item type; Stats (charges, etc.)
• Modifiers (Modifier 1, …): Modifier type; Stats (duration, etc.)

Unit groups: Enemy Heroes, Allied Heroes, Enemy Non-Heroes, Allied Non-Heroes, Neutrals
• Blocks shown per unit and per group: concat, FC-relu, FC-relu, FC-relu, FC, max-pool, slice 0:128, FC-relu, FC
• Group outputs: concat

Core
• concat, FC-relu, FC
• LSTM, 2048 units
• slice 0:512, slice 512:2048, max-pool across Players, concat, sigmoid

Output heads
• Selected Action: Available Actions → Embedding → dot → Softmax → Sample/Argmax
• Offset X, Offset Y, Move X, Move Y, Teleport Destination, Delay, Ward X, Ward Y: each FC → Softmax → Sample/Argmax
• Target Unit: FC, dot with Unit Attention Keys → Softmax → Sample/Argmax
• Win Probability: FC
• Selected Action: Embedding, sigmoid
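
The pattern that recurs in the figure for pickups, abilities, items and modifiers (type embedding plus stats, concat, FC-relu, FC-relu, max-pool) turns a variable-length list into one fixed-size vector. The following is a minimal pure-Python sketch of that pattern, not the original implementation. The layer sizes, the random weights, the zero vector returned for an empty list, and the names `fc_relu` and `encode_set` are all assumptions made for illustration; the figure gives no dimensions for these layers.

```python
import random

def make_matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows (illustrative only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def fc_relu(x, weights, bias):
    """Fully connected layer followed by ReLU; one weight row per output unit."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def encode_set(items, embedding, layers):
    """Encode a list of (type_id, stats) pairs into one fixed-size vector.

    Each item: embedding lookup, concat with stats, then the FC-relu layers.
    The per-item vectors are reduced with an element-wise max (max-pool).
    """
    encoded = []
    for type_id, stats in items:
        h = embedding[type_id] + list(stats)  # concat
        for weights, bias in layers:
            h = fc_relu(h, weights, bias)
        encoded.append(h)
    out_dim = len(layers[-1][1])
    if not encoded:
        # Assumption: an empty set maps to zeros, the smallest ReLU output.
        return [0.0] * out_dim
    return [max(column) for column in zip(*encoded)]

# Hypothetical sizes, chosen only to keep the example small.
rng = random.Random(0)
EMBED_DIM, STATS_DIM, HIDDEN, OUT = 4, 2, 8, 6
embedding = make_matrix(10, EMBED_DIM, rng)  # 10 made-up ability types
layers = [
    (make_matrix(HIDDEN, EMBED_DIM + STATS_DIM, rng), [0.0] * HIDDEN),
    (make_matrix(OUT, HIDDEN, rng), [0.0] * OUT),
]

# Three abilities, each a (type_id, stats) pair with two made-up stats.
abilities = [(3, (0.5, 1.0)), (7, (0.0, 0.2)), (1, (0.9, 0.0))]
pooled = encode_set(abilities, embedding, layers)
```

Because the reduction is an element-wise max, the result has the same length for any number of items and does not depend on the order in which they are listed.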


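
The "Available Actions, Embedding, dot, Softmax, Sample/Argmax" chain in the figure can be read as scoring each currently available action by a dot product, normalising the scores with a softmax, and then either sampling or taking the most likely action. The sketch below illustrates that reading in pure Python. The function names, the two-dimensional embeddings and the `query` vector are hypothetical, and how the network actually produces the vector that is dotted with the embeddings is an assumption here. The Target Unit head, which dots against Unit Attention Keys, appears to follow the same shape.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(query, action_embeddings, available, rng=None):
    """Pick one action id from `available`.

    query: vector assumed to come from the network's FC output.
    action_embeddings: dict mapping action id to its embedding vector.
    rng: a random.Random to sample from the distribution, or None for argmax.
    Returns (action_id, probabilities over `available`).
    """
    scores = [sum(q * e for q, e in zip(query, action_embeddings[a]))
              for a in available]
    probs = softmax(scores)
    if rng is None:
        index = max(range(len(probs)), key=probs.__getitem__)  # argmax
    else:
        index = rng.choices(range(len(probs)), weights=probs, k=1)[0]  # sample
    return available[index], probs

# Made-up embeddings for three actions; action 2 is not currently available.
action_embeddings = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
query = [2.0, 0.5]
available = [0, 1]

greedy_action, probs = select_action(query, action_embeddings, available)
sampled_action, _ = select_action(query, action_embeddings, available,
                                  rng=random.Random(0))
```

Only the available actions are scored, so an unavailable action can never be selected even if its dot product would be the largest.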