CSC 8520 Spring 2010. Paula Matuszek

CS 8520: Artificial Intelligence

Solving Problems by Searching

Paula Matuszek

Spring, 2010

Slides based on Hwee Tou Ng, aima.eecs.berkeley.edu/slides-ppt, which are in turn based on Russell, aima.eecs.berkeley.edu/slides-pdf. Diagrams are based on AIMA.

Search

• The basic concept of search views the state space as a search tree

– Initial state is the root node

– Each possible action leads to a new node defined by the transition model

– Some nodes are identified as goals

• Search is the process of expanding some portion of the tree in some order until we get to a goal node

• The strategy we use to choose the order to expand nodes defines the type of search
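As a rough illustration of the last bullet, the sketch below runs the same expansion loop with two different frontier disciplines. The toy tree, the node names, and the `tree_search` helper are all hypothetical, chosen only to make the expansion order visible.

```python
from collections import deque

# Hypothetical toy tree: each node maps to its children (the transition model).
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": [], "E": [], "F": [], "G": [],
}

def tree_search(root, is_goal, strategy):
    """Expand nodes until a goal is reached; return the order of expansion."""
    frontier = deque([root])
    expanded = []
    while frontier:
        # The strategy is nothing more than the choice of the next frontier node:
        # oldest first (queue) for breadth-first, newest first (stack) for depth-first.
        node = frontier.popleft() if strategy == "breadth" else frontier.pop()
        expanded.append(node)
        if is_goal(node):
            return expanded
        children = TREE[node]
        # For depth-first, push children reversed so the leftmost is expanded first.
        frontier.extend(children if strategy == "breadth" else reversed(children))
    return expanded

print(tree_search("A", lambda n: n == "F", "breadth"))  # ['A', 'B', 'C', 'D', 'E', 'F']
print(tree_search("A", lambda n: n == "F", "depth"))    # ['A', 'B', 'D', 'E', 'C', 'F']
```

Both runs reach the same goal; only the order in which nodes are expanded differs.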

Search in an Adversarial Environment

• Iterative deepening and A* useful for single-agent search problems

• What if there are TWO agents?

• Goals in conflict:

– Adversarial Search

• Especially common in AI:

– Goals in direct conflict

– i.e., GAMES.

Games vs. search problems

• "Unpredictable" opponent ⇒ solution is a strategy specifying a move for every possible opponent reply

• Time limits ⇒ unlikely to find goal, must approximate

• Efficiency matters a lot

• HARD.

• In AI, typically "zero sum": one player wins exactly as much as other player loses.

Types of games

                  Deterministic                            Stochastic
Perfect Info      Chess, Checkers, Othello, Tic-Tac-Toe    Monopoly, Backgammon
Imperfect Info    Battleship                               Bridge, Poker, Scrabble

Tic-Tac-Toe

• Tic Tac Toe is one of the classic AI examples. Let's play some.

• A Tic Tac Toe game:

– http://ostermiller.org/calc/tictactoe.html

• Try it, at various levels of difficulty.

– What kind of strategy are you using?

– What kind does the computer seem to be using?

– Did you win? Lose?

Problem Definition

• Formally define a two-person game as:

• Two players, called MAX and MIN.

– Alternate moves

– At end of game winner is rewarded and loser penalized.

• Game has

– Initial State: board position and player to go first

– Successor Function: returns (move, state) pairs

• All legal moves from the current state

• Resulting state

– Terminal Test

– Utility function for terminal states.

• Initial state plus legal moves define game tree.

Tic Tac Toe Game tree

Optimal Strategies

• Optimal strategy is sequence of moves leading to desired goal state.

• MAX's strategy is affected by MIN's play.

• So MAX needs a strategy which gives the best possible payoff, assuming optimal play on MIN's part.

• Determined by looking at MINIMAX value for each node in game tree.

Minimax

• Perfect play for deterministic games

• Idea: choose move to position with highest minimax value = best achievable payoff against best play

• E.g., 2-ply game:

Minimax algorithm
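
A minimal sketch of minimax over an explicit game tree. The nested-list tree encoding and the function names are assumptions made for illustration, and the example tree is the 2-ply tree commonly used in AIMA (MAX moves at the root, MIN replies).

```python
def minimax_value(node, maximizing):
    """Return the minimax value of a node.

    Assumed encoding: a node is either a number (the utility of a
    terminal state) or a list of child nodes.
    """
    if not isinstance(node, list):
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

def minimax_decision(root):
    """Return the index of the root move with the highest minimax value."""
    values = [minimax_value(child, maximizing=False) for child in root]
    return max(range(len(root)), key=lambda i: values[i])

# 2-ply example: three MAX moves, each answered by one of three MIN replies.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

print(minimax_value(TREE, maximizing=True))  # 3
print(minimax_decision(TREE))                # 0 (the first move)
```

MIN would hold the three moves to 3, 2, and 2 respectively, so MAX's best achievable payoff against best play is 3.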

Properties of minimax

• Complete? Yes (if tree is finite)

• Optimal? Yes (against an optimal opponent)

• Time complexity? O(b^m)

• Space complexity? O(bm) (depth-first exploration)

• For chess, b ≈ 35, m ≈ 100 for "reasonable" games ⇒ exact solution completely infeasible

• Even tic-tac-toe is much too complex to diagram here, although it's small enough to implement.
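
To get a feel for these sizes, the short calculation below works out the order of magnitude of b^m for the chess figures quoted above, and an upper bound on the number of complete Tic-Tac-Toe games. The 9! bound is an added illustration: it counts every ordering of the nine squares and ignores games that end early.

```python
import math

b, m = 35, 100  # branching factor and depth quoted above for chess

# b**m has about m * log10(b) decimal digits.
print(round(m * math.log10(b)))  # 154, i.e. roughly 10**154 nodes

# Tic-Tac-Toe: at most 9! move sequences, since each move fills one of the remaining squares.
print(math.factorial(9))  # 362880
```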

Pruning the Search

• “If you have an idea that is surely bad, don't take the time to see how truly awful it is.” -- Pat Winston

• Minimax is exponential in the number of moves; not feasible in real life

• But we can PRUNE some branches.

• Alpha-Beta pruning

– If it is clear that a branch can't improve on the value we already have, stop analysis.

α-β pruning example

Properties of α-β

• Pruning does not affect final result

• Good move ordering improves effectiveness of pruning

• With "perfect ordering," time complexity = O(b^(m/2)) ⇒ doubles depth of search which can be carried out for a given level of resources.

• A simple example of the value of reasoning about which computations are relevant (a form of metareasoning)

Why is it called α-β?

• α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for max

• If v, the value of the node under consideration, is worse than α, max will avoid it ⇒ prune that branch

• Define β similarly for min

The α-β algorithm
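
A minimal sketch of α-β search, assuming the same kind of nested-list game tree used in AIMA's 2-ply example (numbers are terminal utilities, lists are internal nodes). The `visited` list is a hypothetical addition that records which leaves are actually evaluated, so the effect of pruning can be seen.

```python
import math

def alphabeta(node, alpha, beta, maximizing, visited):
    """Return the minimax value of node, skipping branches that cannot matter.

    alpha: best value found so far for MAX along the current path.
    beta:  best value found so far for MIN along the current path.
    """
    if not isinstance(node, list):
        visited.append(node)  # record each leaf that is actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            if value >= beta:
                return value  # MIN higher up would never allow this branch
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        if value <= alpha:
            return value  # MAX higher up already has a choice at least this good
        beta = min(beta, value)
    return value

TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(alphabeta(TREE, -math.inf, math.inf, True, visited))  # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

The result matches plain minimax, but the leaves 4 and 6 are never examined: once MIN can force 2 in the second branch, MAX already prefers the first branch (worth 3). With this leaf order the third branch is searched in full; a better move ordering there would prune more.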

"Informed" Search

• Alpha-Beta still not feasible for large game spaces.

• Can we improve on performance with domain knowledge?

• Yes -- if we have a useful heuristic for evaluating game states.

• Conceptually analogous to A* for single-agent search.

Resource limits

Suppose we have 100 secs, explore 10^4 nodes/sec ⇒ 10^6 nodes per move

Standard approach:

• cutoff test: e.g., depth limit (perhaps add quiescence search)

• evaluation function = estimated desirability of position

Evaluation function

• Evaluation function or static evaluator is used to evaluate the “goodness” of a game position.

– Contrast with heuristic search where the evaluation function was a non-negative estimate of the cost from the start node to a goal and passing through the given node

• The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players.

– f(n) >> 0: position n good for me and bad for you

– f(n) << 0: position n bad for me and good for you

– f(n) near 0: position n is a neutral position

– f(n) = +infinity: win for me

– f(n) = -infinity: win for you

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Evaluation function examples

• Example of an evaluation function for Tic-Tac-Toe:

f(n) = [# of 3-lengths open for me] - [# of 3-lengths open for you]

where a 3-length is a complete row, column, or diagonal

• Alan Turing’s function for chess

– f(n) = w(n)/b(n) where w(n) = sum of the point value of white’s pieces and b(n) = sum of black’s

• Most evaluation functions are specified as a weighted sum of position features:

f(n) = w1*feat1(n) + w2*feat2(n) + ... + wk*featk(n)

• Example features for chess are piece count, piece placement, squares controlled, etc.

• Deep Blue (which beat Garry Kasparov in 1997) had over 8000 features in its evaluation function

DesJardins: www.cs.umbc.edu/671/fall03/slides/c8-9_games.ppt

Cutting off search

MinimaxCutoff is identical to MinimaxValue except

1. Terminal? is replaced by Cutoff?

2. Utility is replaced by Eval

Does it work in practice?

For chess: b^m = 10^6, b = 35 ⇒ m = 4

4-ply lookahead is a hopeless chess player!

– 4-ply ≈ human novice

– 8-ply ≈ typical PC, human master

– 12-ply ≈ Deep Blue, Kasparov
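
A minimal sketch of the two substitutions, using a toy game rather than chess. The game is an assumption chosen for brevity: a pile of stones, each player removes 1 or 2, and whoever takes the last stone wins. The function names and both evaluation functions are hypothetical.

```python
def cutoff_value(stones, max_to_move, depth, limit, evaluate):
    """Minimax value with a depth cutoff for the toy take-1-or-2 game."""
    if stones == 0:
        # True terminal state: the previous player took the last stone and won.
        return -1 if max_to_move else 1
    if depth >= limit:
        # Cutoff? replaces Terminal?, and Eval replaces Utility.
        return evaluate(stones, max_to_move)
    values = [cutoff_value(stones - take, not max_to_move, depth + 1, limit, evaluate)
              for take in (1, 2) if take <= stones]
    return max(values) if max_to_move else min(values)

def neutral(stones, max_to_move):
    """Uninformative evaluation: every non-terminal position looks even."""
    return 0

def informed(stones, max_to_move):
    """Domain knowledge: the player to move loses when stones is a multiple of 3."""
    mover_loses = stones % 3 == 0
    return (-1 if mover_loses else 1) if max_to_move else (1 if mover_loses else -1)

print(cutoff_value(4, True, 0, 10, neutral))   # 1: deep enough to see the forced win
print(cutoff_value(4, True, 0, 1, neutral))    # 0: cut off before anything is decided
print(cutoff_value(4, True, 0, 1, informed))   # 1: a good Eval recovers the right answer
```

With the same 1-ply lookahead, the quality of the evaluation function decides whether the shallow search gets the position right.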

Deterministic games in practice

• Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. It now plays perfectly, using a combination of alpha-beta search and a database of 39 trillion end positions.

• Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searched 200 million positions per second, used very sophisticated evaluation, and used undisclosed methods for extending some lines of search up to 40 ply. Newer programs (Hydra and Rybka) may actually be better than any human player.

• Othello: human champions refuse to compete against computers, which are too good.

• Go: programs are just beginning to be good enough to play human champions. In Go, b > 300, so most programs use pattern knowledge bases to suggest plausible moves; MoGo also uses Monte Carlo Tree Search, one of the first demonstrations of its value.

Non-Deterministic Games

• Alpha-Beta search assumes that each player plays perfectly and we know what the possible moves are.

• Suppose there is a chance element?

• In Backgammon, for instance, each player rolls dice, which determine the possible moves to be made.

Dealing with Chance

• Add a “chance” layer of nodes between each player’s move that captures the possible rolls and their probability

• Expand the minimax value to an “expected minimax value”: each node’s value is weighted by the probability of it occurring

• Therefore the “best” move is not the one whose evaluation is highest, but the one with a high evaluation which is also likely to happen.
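
A minimal sketch of the expected minimax value on a hand-built tree. The tuple-based node encoding, the probabilities, and the leaf values are all made up for illustration.

```python
def expectiminimax(node):
    """Return the expected minimax value of a node.

    Assumed encoding: a number is a terminal value; otherwise a node is
    (kind, children), where kind is "max", "min", or "chance". Children of
    a chance node are (probability, child) pairs.
    """
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    # Chance node: weight each outcome by the probability of it occurring.
    return sum(p * expectiminimax(child) for p, child in children)

# MAX chooses between two moves; each is followed by a chance event, then MIN replies.
TREE = ("max", [
    ("chance", [(0.5, ("min", [4, 6])), (0.5, ("min", [2, 8]))]),    # expected 3.0
    ("chance", [(0.9, ("min", [1, 3])), (0.1, ("min", [20, 30]))]),  # expected about 2.9
])

print(expectiminimax(TREE))  # 3.0
```

The second move contains the most attractive outcome (20), but it happens only 10% of the time, so the first move has the higher expected value.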

Evaluation Functions for Chance

• In a deterministic game, alpha-beta minimax is concerned only with which value of our evaluation function is bigger; it is sufficient for our evaluation function to order the possibilities correctly.

• With an expectiminimax function, the size of the difference also matters; ideally, values indicate not only which move is better but how much better it is.

• Ideally, the evaluation function is a positive linear transformation of the probability of winning from that position.
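
The small experiment below illustrates why the scale matters once chance nodes are involved. The two moves, their probabilities, and the leaf values are illustrative numbers (in the style of the AIMA example), and `best_move` is a hypothetical helper.

```python
def best_move(moves, transform=lambda v: v):
    """Return the index of the move with the highest expected (transformed) value.

    moves: one chance node per move, each a list of (probability, leaf value) pairs.
    """
    expected = [sum(p * transform(v) for p, v in outcomes) for outcomes in moves]
    return max(range(len(moves)), key=lambda i: expected[i])

MOVES = [[(0.9, 2), (0.1, 3)],   # move 0: expected value 2.1
         [(0.9, 1), (0.1, 4)]]   # move 1: expected value 1.3

# Original evaluation values: move 0 is preferred.
print(best_move(MOVES))                                     # 0

# Positive linear transformation (10v + 5): the decision is unchanged.
print(best_move(MOVES, lambda v: 10 * v + 5))               # 0

# Order-preserving but non-linear rescaling: the decision flips.
print(best_move(MOVES, {1: 1, 2: 20, 3: 30, 4: 400}.get))   # 1
```

All three evaluation functions rank the leaves in the same order, which would be enough for deterministic minimax, yet the non-linear one changes which move looks best.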

Performance in Non-Deterministic Games

• We have expanded our search tree to include layers containing all possible dice rolls -- what is the performance hit?

• Basically, we now have O(b^m n^m), where n is the number of different dice rolls.

• Ouch!

• Can in some cases still find a cutoff; if the probability of a node is low enough we may not need to determine its value.

– Requires that we know the bounds on the evaluation function.

• Monte Carlo alternative

Summary

• Games are fun to work on!

• They illustrate several important points about AI

• perfection is unattainable ⇒ must approximate

• good idea to think about what to think about

Search Summary

• For uninformed search, there are tradeoffs between time and space complexity, with iterative deepening often the best choice.

• For non-adversarial informed search, A* usually the best choice; the better the heuristic, the better the performance.

• For adversarial search, minimax with alpha-beta pruning is optimal where feasible

• Adding an evaluation-function-based cutoff increases range of feasibility.

• The better we can capture domain knowledge in the heuristic and evaluation functions, the better we can do.