Agents that can play multi-player games

Page 1: Agents that can play  multi-player games

Agents that can play multi-player games

Page 2: Agents that can play  multi-player games

Recall: Single-player, fully-observable, deterministic game agents

An agent that plays Peg Solitaire involves:
- A representation of the initial state;
- A method to generate new states from existing ones;
- A test for whether a state is a goal state.
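The three ingredients can be sketched in Python on a toy problem. This is only an illustration under stated assumptions: instead of the triangular board, it uses a hypothetical one-row variant (a tuple of booleans, True meaning a peg), and the names `INITIAL`, `successors`, `is_goal`, and `solve` are invented for this sketch.

```python
from collections import deque

# Hypothetical toy variant: a single row of holes, True = peg, False = empty.
INITIAL = (True, True, False, True)

def successors(state):
    """Yield every state reachable by one jump (a peg hops over a neighbor into a hole)."""
    n = len(state)
    for i in range(n):
        for step in (1, -1):
            over, land = i + step, i + 2 * step
            if 0 <= land < n and state[i] and state[over] and not state[land]:
                new = list(state)
                new[i], new[over], new[land] = False, False, True
                yield tuple(new)

def is_goal(state):
    """Goal test: exactly one peg remains."""
    return sum(state) == 1

def solve(initial):
    """Breadth-first search; returns the list of states from initial to a goal, or None."""
    frontier = deque([[initial]])
    seen = {initial}
    while frontier:
        path = frontier.popleft()
        if is_goal(path[-1]):
            return path
        for nxt in successors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None
```

Calling `solve(INITIAL)` returns the sequence of boards from the initial state to a one-peg board. Any other search strategy could replace the breadth-first loop without touching the three problem-specific pieces.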

Initial Board for Triangle Peg Solitaire

A jump, with resulting board

The goal state:

Page 3: Agents that can play  multi-player games

Recall: Single-player, fully-observable, deterministic game agents

Initial Board for Triangle Peg Solitaire

A jump, with resulting board

The goal state:

Initial state

Successor state axioms or STRIPS effects

Goal state

Page 5: Agents that can play  multi-player games

Goal state vs. Terminal states and Utilities

The goal state:

Terminal states

Utility: +2

Utility: +1

Utility: -1

Page 6: Agents that can play  multi-player games

Quiz: Goal states vs. Terminal states and Utilities

Initial state

Successor state axioms or STRIPS effects

Terminal states

What could go wrong when using A* or breadth-first or other strategies with terminal states?

[Figure: terminal states with utilities +1, +2, and -1.]

Page 7: Agents that can play  multi-player games

Answer: Goal states vs. Terminal states and Utilities

Initial state

Successor state axioms or STRIPS effects

Terminal states

You’re guaranteed to find the best path to the terminal state that is found.

You’re NOT guaranteed to find the best terminal state (the one with highest utility), unless you do an exhaustive search.
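A small Python sketch of the difference. The tree below is hypothetical (the node names and shape are assumptions made for illustration); only the utilities +1, +2, and -1 come from the slide.

```python
# Hypothetical tree: inner nodes map to lists of children, terminals map to utilities.
CHILDREN = {"start": ["a", "b"], "a": ["t1"], "b": ["c"], "c": ["t2", "t3"]}
UTILITY = {"t1": 1, "t2": 2, "t3": -1}

def first_terminal_bfs(root):
    """Breadth-first search that stops at the first terminal state it reaches."""
    frontier = [root]
    while frontier:
        node = frontier.pop(0)
        if node in UTILITY:
            return node
        frontier.extend(CHILDREN[node])
    return None

def best_terminal(node):
    """Exhaustive depth-first search: returns (utility, terminal) with the highest utility."""
    if node in UTILITY:
        return UTILITY[node], node
    return max(best_terminal(child) for child in CHILDREN[node])
```

On this tree the breadth-first search stops at the shallow terminal `t1` (utility +1), while the exhaustive search finds `t2` (utility +2), which sits one level deeper.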

[Figure: terminal states with utilities +1, +2, and -1.]

Page 8: Agents that can play  multi-player games

Hex: Two-player, zero-sum game

(Also, deterministic and fully-observable.)

Hex:
- Two players, red and blue.
- Board is N x N, with hexagonal spaces.
- Two opposite sides are red, and the other two sides are blue.
- Each player’s objective is to build a path connecting the sides of his or her color.
- Players alternate turns, and place a single piece of their color on their turn.
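A terminal test for these rules could be sketched as below. It is a minimal illustration, not a full Hex engine; it assumes the board is an N x N list of lists holding 'R', 'B', or None, that red owns the top and bottom edges, and that blue owns the left and right edges. The names `NEIGHBORS` and `has_won` are invented here.

```python
from collections import deque

# The six neighbors of a hexagonal cell (row, col) on a rhombus-shaped board.
NEIGHBORS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, 1), (1, -1)]

def has_won(board, player):
    """Return True if player has a connected chain between their two sides.
    Assumption: red ('R') owns top and bottom, blue ('B') owns left and right."""
    n = len(board)
    if player == "R":
        starts = [(0, c) for c in range(n) if board[0][c] == "R"]
        reached_far_side = lambda r, c: r == n - 1
    else:
        starts = [(r, 0) for r in range(n) if board[r][0] == "B"]
        reached_far_side = lambda r, c: c == n - 1
    frontier, seen = deque(starts), set(starts)
    while frontier:
        r, c = frontier.popleft()
        if reached_far_side(r, c):
            return True
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < n and 0 <= nc < n
            if inside and (nr, nc) not in seen and board[nr][nc] == player:
                seen.add((nr, nc))
                frontier.append((nr, nc))
    return False
```

Note that each cell has six neighbors rather than four or eight, so one diagonal direction connects and the other does not.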

Page 9: Agents that can play  multi-player games

Hex: Two-player, zero-sum game

Some fun facts:
- There are no ties in Hex (proved by John Nash).
- First player has a distinct advantage (also proved by Nash).
- In tournament play, it’s common to use the “pie rule”, for fairness: after the first player makes the first move, the second player can choose whether to switch sides. (We will ignore this rule.)

Page 10: Agents that can play  multi-player games

Hex Question

What is red’s best move (red’s turn next)?

Page 11: Agents that can play  multi-player games

Hex Question

What is red’s best move (red’s turn next)?

This orange one looks pretty good: only one more square, and red will win.

Using a simple heuristic, this looks like it’s getting close to the goal.

Page 12: Agents that can play  multi-player games

Hex Question

What is red’s best move (red’s turn next)?

However, if red moves to the orange square, the blue player can win on the next turn!

Page 13: Agents that can play  multi-player games

Quiz: Hex Question

If red moves to the orange square, what is blue’s best move?

Page 14: Agents that can play  multi-player games

Answer: Hex Question

Blue has no good moves left!

Page 15: Agents that can play  multi-player games

Answer: Hex Question

Blue has no good moves left!

This one’s bad – red can still connect the paths.

Page 16: Agents that can play  multi-player games

Answer: Hex Question

Blue has no good moves left!

And this one’s bad too – red can still connect the paths.

Page 17: Agents that can play  multi-player games

Reasoning about 2-player games

To pick a good move, each player has to think about the other player’s possible responses!

Page 18: Agents that can play  multi-player games

Extensive Form Representation of Games

Notation:
- Two players, Max (Δ) and Min (∇).
- Terminal states are represented by a node with a number inside: the utility for Max (Δ). (Since we’re doing zero-sum games, the utility for Min (∇) is just the opposite of this number.)

Page 19: Agents that can play  multi-player games

Extensive Form Representation of Games

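Game tree:

[Figure: game tree. The root is Max’s turn (∆). Max’s possible actions lead to resulting worlds/boards where it is Min’s turn (∇). Min’s possible actions lead to resulting worlds/boards where it is Max’s turn (∆) again, and so on down to the terminal states, which are labeled with the utility for Max (for example +1, +2, -1).]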

Page 20: Agents that can play  multi-player games

Minimax (Backup) Algorithm

Basic Idea: Compute ∆’s Value(n) for each node n in the game tree, starting with the leaves and working up (“backup”).

We’ll use a depth-first tree traversal.

Once this is calculated, Max will choose an action that leads to a child node with the highest possible value.

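[Figure: game tree with a ∆ root and three ∇ children; from left to right, the leaves under the ∇ nodes have utilities {3, 4, 4}, {1, 8, 12}, and {2, 30, 15}.]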

Page 21: Agents that can play  multi-player games

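Minimax (Backup) Algorithm

Value(n) =
- If n is a terminal node: Value(n) = ∆’s utility
- If n is ∆’s turn: Value(n) = max { Value(c) : c is a child of n }
- If n is ∇’s turn: Value(n) = min { Value(c) : c is a child of n }

[Figure: game tree with a ∆ root, three ∇ children, and leaf utilities {3, 4, 4}, {1, 8, 12}, {2, 30, 15}.]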

Page 22: Agents that can play  multi-player games

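Minimax (Backup) Algorithm

Value(n) =
- If n is a terminal node: Value(n) = Max’s utility
- If n is ∆’s turn: Value(n) = max { Value(c) : c is a child of n }
- If n is ∇’s turn: Value(n) = min { Value(c) : c is a child of n }

[Figure: game tree with a ∆ root, three ∇ children, and leaf utilities {3, 4, 4}, {1, 8, 12}, {2, 30, 15}; the left and right ∇ nodes are annotated with their values.]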

Value: min {3, 4, 4} = 3

Value: min {2, 30, 15} = 2

Page 23: Agents that can play  multi-player games

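Quiz: Minimax (Backup) Algorithm

Value(n) =
- If n is a terminal node: Value(n) = Max’s utility
- If n is ∆’s turn: Value(n) = max { Value(c) : c is a child of n }
- If n is ∇’s turn: Value(n) = min { Value(c) : c is a child of n }

[Figure: game tree with a ∆ root, three ∇ children, and leaf utilities {3, 4, 4}, {1, 8, 12}, {2, 30, 15}; the left and right ∇ nodes are annotated with their values.]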

Value: min {3, 4, 4} = 3

Value: min {2, 30, 15} = 2

1. What is the Value of the middle ∇ node?

2. What is the value of the top ∆ node?

Page 24: Agents that can play  multi-player games

Answer: Minimax (Backup) Algorithm

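Value(n) =
- If n is a terminal node: Value(n) = Max’s utility
- If n is ∆’s turn: Value(n) = max { Value(c) : c is a child of n }
- If n is ∇’s turn: Value(n) = min { Value(c) : c is a child of n }

[Figure: game tree with a ∆ root, three ∇ children, and leaf utilities {3, 4, 4}, {1, 8, 12}, {2, 30, 15}.]

1. What is the Value of the middle ∇ node? min {1, 8, 12} = 1

2. What is the value of the top ∆ node? max {3, 1, 2} = 3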
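The backup rule can be sketched in a few lines of Python. This is a minimal illustration, assuming a game tree encoded as nested lists whose leaves are Max's utilities and in which the players strictly alternate; the encoding and the names `minimax`, `TREE`, `min_values`, and `best_action` are assumptions of this sketch.

```python
def minimax(node, max_to_move=True):
    """Return the backed-up value of node.
    A node is either a number (terminal utility for Max) or a list of child nodes."""
    if isinstance(node, (int, float)):
        return node
    child_values = [minimax(child, not max_to_move) for child in node]
    return max(child_values) if max_to_move else min(child_values)

# The example tree from the slides: a Max root with three Min children.
TREE = [[3, 4, 4], [1, 8, 12], [2, 30, 15]]

min_values = [minimax(child, max_to_move=False) for child in TREE]
root_value = minimax(TREE)
# Max picks the child with the highest backed-up value (index 0 is the leftmost action).
best_action = max(range(len(TREE)), key=lambda i: min_values[i])
```

The recursion is a depth-first traversal: a node's value is computed only after the values of all of its children are known.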

Page 25: Agents that can play  multi-player games

Quiz: Minimax

1. Compute the value of each node in the game tree.

2. Which action should Max take?

3. What is Min’s optimal response?

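[Figure: game tree. Max (∆) is at the root and chooses among actions a, b, and c; each action leads to a Min (∇) node, and below the Min nodes are five further Max (∆) nodes and the terminal states.]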

Page 26: Agents that can play  multi-player games

Answer: Minimax

1. Compute the value of each node in the game tree.

2. Which action should Max take? Action on right (c)

3. What is Min’s optimal response? Action on right

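[Figure: the same game tree annotated with backed-up values. The five lower ∆ nodes have values 6, 7, 10, 30, and 15; the ∇ nodes reached by a, b, and c have values 4, 1, and 15; the root has value 15.]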

Page 27: Agents that can play  multi-player games

From Extensive Form to Normal Form Games

Every “extensive form” game (even ones where you don’t have zero-sum utilities) can be made into a “normal form” game.

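[Figure: game tree. Max (∆) first chooses A or B, then Min (∇) chooses C or D. If Min chooses C the game ends, with utility 4 after A and 1 after B. If Min chooses D, Max (∆) moves again: after A then D, action A gives 5 and B gives 7; after B then D, action A gives 4 and B gives 10.]

Normal form (rows: Max’s first and second actions; columns: Min’s action; entries: utility for Max, utility for Min):

            C           D
A, A      +4, -4      +5, -5
A, B      +4, -4      +7, -7
B, A      +1, -1      +4, -4
B, B      +1, -1      +10, -10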

Each sequence of actions for a player becomes a row or a column.

The size of the resulting matrix can be exponential in the size of the game tree.
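The conversion can be sketched in Python for the tree on this slide. The nested-dictionary encoding and the names `TREE`, `payoff`, and `matrix` are assumptions of this sketch. Following the slide's table, a row is Max's pair (first action, second action) and a column is Min's single action.

```python
from itertools import product

# Tree from the slide: Max moves (A/B), Min replies (C/D), and after D Max moves again.
TREE = {
    "A": {"C": 4, "D": {"A": 5, "B": 7}},
    "B": {"C": 1, "D": {"A": 4, "B": 10}},
}

def payoff(max_plan, min_action):
    """Utility for Max when Max plays (first, second) and Min plays min_action."""
    first, second = max_plan
    outcome = TREE[first][min_action]
    # After C the game is over; after D, Max's second action selects the leaf.
    return outcome[second] if isinstance(outcome, dict) else outcome

max_plans = list(product("AB", repeat=2))  # rows: (A,A), (A,B), (B,A), (B,B)
min_actions = ["C", "D"]                   # columns
matrix = {plan: [payoff(plan, a) for a in min_actions] for plan in max_plans}
```

Because the game is zero-sum, Min's entry in each cell is the negative of the value stored here. The number of rows doubles with every additional decision point for Max, which is where the exponential growth comes from.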

Page 28: Agents that can play  multi-player games

From Normal Form games to Extensive Form games

Not every Normal Form game can be represented using the Extensive Form I have shown you so far.

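          C           D
C       +2, -2      -3, +3
D       -3, +3      +4, -4

[Figure: two candidate game trees for this matrix. In the first, Max (∆) chooses C or D and then Min (∇) chooses C or D; in the second, Min (∇) moves first and Max (∆) moves second. Both have leaf utilities 2, -3, -3, 4. Which tree, if either, represents the game?]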
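One way to see the problem is to back up values in both candidate trees. The sketch below is an illustration under the assumption that rows of the matrix are Max's action and columns are Min's; the names `PAYOFF`, `max_first`, and `min_first` are invented here.

```python
# Payoffs to Max from the matrix above; keys are (Max's action, Min's action).
PAYOFF = {("C", "C"): 2, ("C", "D"): -3, ("D", "C"): -3, ("D", "D"): 4}
ACTIONS = ["C", "D"]

# Tree 1: Max moves first, Min sees the move and replies.
max_first = max(min(PAYOFF[(a, b)] for b in ACTIONS) for a in ACTIONS)

# Tree 2: Min moves first, Max sees the move and replies.
min_first = min(max(PAYOFF[(a, b)] for a in ACTIONS) for b in ACTIONS)
```

The two trees back up to different values (-3 when Max moves first, 2 when Min moves first), so whoever moves second gains from seeing the other's choice. In the matrix game neither player sees the other's action, so neither tree captures it.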

Page 29: Agents that can play  multi-player games

From Normal Form games to Extensive Form games

Can introduce new notation – information states – that allows the Extensive Form to represent any Normal Form game.

[Figure: the payoff matrix and the two game trees from the previous slide.]

Page 30: Agents that can play  multi-player games

From Normal Form games to Extensive Form games

Information states are also useful for handling Partial Observability in turn-based games. E.g., in Poker, they can be used to represent the set of all hands your opponent may have been dealt.

[Figure: the payoff matrix and the two game trees from the previous slide.]

Page 31: Agents that can play  multi-player games

Perfect Information Games

Definition: A game in extensive form has perfect information if every information state has only one node. (This is the same as our original version of game trees.)

Perfect Information is basically just another name for full observability for game trees.

We’ll talk more about partial observability later.

Theorem (Zermelo, 1913): Every finite, perfect-information game in extensive form has a pure-strategy Nash equilibrium.

Page 32: Agents that can play  multi-player games

Relation between Minimax Algorithm and Minimax Theorem

Recall that the Minimax Theorem says every 2-player, zero-sum game has a Value for each player and a Nash Equilibrium.

The guy who proved this (von Neumann) used essentially the Minimax algorithm to prove the theorem.

The Value of the root node in the Minimax algorithm is the same as the Value of the game for the Max player.

Page 33: Agents that can play  multi-player games

Quiz: Time Complexity of Minimax

Let b be the branching factor of the game tree.

Let m be the depth of the game tree.

What is the time complexity of Minimax?
- O(b+m)?
- O(bm)?
- O(b^m)?
- O(m^b)?


Page 34: Agents that can play  multi-player games

Answer: Time Complexity of Minimax

Let b be the branching factor of the game tree.

Let m be the depth of the game tree.

What is the time complexity of Minimax?
- O(b+m)?
- O(bm)?
- O(b^m) (correct)
- O(m^b)?
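The exponential count can be checked with a short sketch. It assumes a uniform tree in which every non-terminal node has exactly b children and every terminal is at depth m; `count_nodes` is a name invented for this illustration.

```python
def count_nodes(b, m):
    """Number of nodes in a uniform tree with branching factor b and depth m.
    Minimax computes a value for every one of them."""
    if m == 0:
        return 1
    return 1 + b * count_nodes(b, m - 1)
```

The total is 1 + b + b^2 + ... + b^m, which is dominated by the b^m leaves. For example, `count_nodes(3, 2)` is 1 + 3 + 9 = 13.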


Page 35: Agents that can play  multi-player games

Quiz: Space Complexity of Minimax

Let b be the branching factor of the game tree.

Let m be the depth of the game tree.

What is the space complexity of Minimax?
- O(b+m)?
- O(bm)?
- O(b^m)?
- O(m^b)?


Page 36: Agents that can play  multi-player games

Answer: Space Complexity of Minimax

Let b be the branching factor of the game tree.

Let m be the depth of the game tree.

What is the space complexity of Minimax?
- O(b+m)?
- O(bm) (correct)
- O(b^m)?
- O(m^b)?
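A rough way to see the O(bm) bound is to simulate a depth-first traversal and track how many nodes are stored at once. The sketch assumes a uniform tree (branching factor b, depth m) and a traversal that generates all b children when it expands a node; `peak_stack_size` is a name invented for this illustration.

```python
def peak_stack_size(b, m):
    """Depth-first traversal of a uniform tree using an explicit stack.
    Returns the largest number of generated-but-unexpanded nodes held at once."""
    stack = [0]  # each entry is the depth of a stored node; start with the root
    peak = 1
    while stack:
        depth = stack.pop()
        if depth < m:
            stack.extend([depth + 1] * b)  # generate all b children
        peak = max(peak, len(stack))
    return peak
```

Under these assumptions the peak works out to (b - 1) * m + 1 nodes: at each of the m levels along the current path, the traversal holds the siblings it has not explored yet. That grows linearly in both b and m, even though the number of nodes visited is exponential.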


Page 37: Agents that can play  multi-player games

Quiz: Complexity of Minimax

Chess has an average branching factor of ~30, and each game takes on average ~40 moves.

If it takes ~1 millisecond to compute the value of each board position in the game tree, how long would it take to figure out the value of the game using Minimax?
- A few milliseconds?
- A few seconds?
- A few minutes?
- A few hours?
- A few days?
- A few years?
- A few decades?
- A few millennia (thousands of years)?
- More time than the age of the universe?

Page 38: Agents that can play  multi-player games

Answer: Complexity of Minimax

Chess has an average branching factor of ~30, and each game takes on average ~40 moves.

If it takes ~1 millisecond to compute the value of each board position in the game tree, how long would it take to figure out the value of the game using Minimax?

More time than the age of the universe.
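The arithmetic behind this answer, as a back-of-the-envelope sketch. It treats the ~40 as the depth m of the game tree and assumes the universe is roughly 13.8 billion years old; the constant names are invented for this illustration.

```python
BRANCHING = 30                    # b, from the slide
DEPTH = 40                        # m, from the slide
SECONDS_PER_POSITION = 1e-3       # ~1 millisecond per board position
AGE_OF_UNIVERSE_YEARS = 1.38e10   # assumption: roughly 13.8 billion years
SECONDS_PER_YEAR = 365.25 * 24 * 3600

positions = BRANCHING ** DEPTH    # about b^m positions to evaluate
years = positions * SECONDS_PER_POSITION / SECONDS_PER_YEAR
ratio = years / AGE_OF_UNIVERSE_YEARS
```

30^40 is on the order of 10^59 positions, which at one millisecond each comes to more than 10^48 years, many orders of magnitude beyond the age of the universe.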

Page 39: Agents that can play  multi-player games

Strategies for coping with complexity

• Reduce b
• Reduce m
• Memoize
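As one illustration of the last strategy, memoization can be sketched on a small game where the same position is reachable by many move orders. The game below is a hypothetical stand-in chosen for brevity (a pile of stones, each player removes 1 to 3, whoever takes the last stone wins); it is not from the slides, and the names `value_plain`, `value_memo`, and `calls` are invented here.

```python
from functools import lru_cache

calls = {"plain": 0, "memo": 0}

def value_plain(stones, max_to_move=True):
    """Minimax value for Max: +1 if Max takes the last stone, -1 if Min does."""
    calls["plain"] += 1
    if stones == 0:
        # The previous player took the last stone and won.
        return -1 if max_to_move else 1
    results = [value_plain(stones - k, not max_to_move)
               for k in (1, 2, 3) if k <= stones]
    return max(results) if max_to_move else min(results)

@lru_cache(maxsize=None)
def value_memo(stones, max_to_move=True):
    """Same recursion, but each (stones, player) position is evaluated only once."""
    calls["memo"] += 1
    if stones == 0:
        return -1 if max_to_move else 1
    results = [value_memo(stones - k, not max_to_move)
               for k in (1, 2, 3) if k <= stones]
    return max(results) if max_to_move else min(results)
```

Both functions return the same value, but for a pile of 12 stones the plain version makes thousands of recursive calls while the memoized version makes at most a couple of dozen, one per distinct position. Reducing b and reducing m attack the b^m node count directly; memoization instead avoids recomputing positions that the tree reaches more than once.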