Review - Carnegie Mellon School of Computer Science1. Select K random pairs of parents ( X,X’) 2. For each pair of parents ( X,X’): 1.1 Generate offsprings (Y1,Y2) using crossover

1

Review

Uninformed Search

b

a

d

pq

h

e

c

f

r

START

GOAL

2

1

3

1

9

15

8

2

2

4

9

5

5

5

4

1

3

2

Complexity• N = Total number of states

• B = Average number of successors (branching factor)

• L = Length for start to goal with smallest number of steps

• Q = Average size of the priority queue

• Lmax = Length of longest path from START to any state

O(Min(N,2BL/2))O(Min(N,2BL/2))Y, If all trans.

have same cost

YBi- Direction.

BFS

BIBFS

O(BL)O(BL)Y, If all trans. have same cost

YIterative Deepening

IDS

O(Min(N,BLmax))O(Min(N,BLmax))NYMemorizing DFS

MEMDFS

O(BLmax)O(BLmax)NYPath Check DFS

PCDFS

O(Min(N,BL))O(log(Q)*Min(N,BL))Y, If cost > 0Y, If cost > 0

Uniform Cost Search

UCS

O(Min(N,BL))O(Min(N,BL))Y, If all trans.

have same cost

YBreadth First

Search

BFS

SpaceTimeOptimalCompleteAlgorithm

START

A

B

C

GOAL

h(A) = 3

h(B) = 6

Our best guess is that A is closer

to GOAL than B so maybe it is a

more promising state to expand

h(B) = 10

Informed Search

3

Informed Search

• Best-First Search: Expand node with minimum h(s)

• A*: Expand node with minimum f(s) = g(s) + h(s)

• Guaranteed to be optimal if h admissible h(s) <= h*(s)

• No nodes revisited if monotonic:

h(s) < h(s’) + cost(s,s’)

• IDA* = Equivalent of Iterative Deepening for A* � guarantees low memory usage

4

)()()2()( swhsgwsf +−=

Most Basic Algorithm: Hill-Climbing (Greedy Local Search)

• X Initial configuration

• Iterate:

1. E Eval(X)

2. N Neighbors(X)

3. For each Xi in N

Ei Eval(Xi)

4. If all Ei’s are lower than E

Return X

Else

i* = argmaxi (Ei) X Xi* E Ei*

5

Stochastic Search: Randomized Hill-Climbing

• X Initial configuration

• Iterate:

1. E Eval(X)

2. X’ one configuration

randomly selected in

Neighbors (X)

3. E’ Eval(X’)

4. If E’ > E

X X’

E E’

Critical change: We no

longer select the best move in the entire

neighborhood

Until when?

Simulated Annealing1. Do K times:

1.1 E Eval(X)

1.2 X’ one configuration randomly

selected in Neighbors (X)

1.3 E’ Eval(X’)

1.4 If E’ >= E

X X’; E E’;

Else accept the move with probability

p = e -(E – E’)/T :

X X’; E E’;

2. T α T

6

Basic GA Outline• Create initial population X = {X1,..,XP}

• Iterate:

1. Select K random pairs of parents (X,X’)

2. For each pair of parents (X,X’):

1.1 Generate offsprings (Y1,Y2) using crossover operation

1.2 For each offspring Yi:

Replace randomly selected element of the population by Yi

With probability µ:

Apply a random mutation to Yi

• Return the best individual in the population

7

CSP• Definitions

• Standard search

• Improvements– Backtracking

– Forward checking

– Constraint propagation

• Heuristics:– Variable ordering

– Value ordering

• Examples

• Tree-structured CSP

• Local search for CSP problems

Constraint Propagation• A = queue of active arcs (Vi,Vj)

• Repeat while A not empty:

– (Vi,Vj) next element of A

– For each x in D(Vi):

• Remove x from D(Vi) if there is no y in D(Vj) for which (x,y) satisfies the constraint between Vi and Vj.

– If D(Vi) has changed:

• Add all the pairs (Vk,Vi), where Vk is a neighbor of Vi (k not equal to j) to A

8

Order the variables such that the parent of a node

appears always before that node in the list

O(N d2)

• Most Constraining Variable– Selecting a variable which contributes to the largest number of

constraints will have the largest effect on the other variables �Hopefully will prune a larger part of the search

– This amounts to finding the variable that is connected to the largest number of variables in the constraint graph.

• Minimum Remaining Values (MRV)

– Selecting the variable that has the least number of candidate

values is most likely to cause a failure early (“fail-first” heuristic)

• Least Constraining Value– Choose the value which causes the smallest reduction in the number of

available values for the neighboring variables

9

• Configuration space C = set of values of q corresponding to

legal configurations of the robot• Defines the set of possible parameters (the search space) and

the set of allowed paths

• Grow the forbidden parts of C-Space � Can assume that

robot is a point in C-Space

Configuration Space (C-Space)

10

Visibility Graphs

• N = total number of vertices of the

obstacle polygons

• Naïve: O(N3)

• Sweep: O(N2 log N)

• Optimal: O(N2)

Voronoi Diagrams

• Key property: The points on the edges of the Voronoidiagram are the furthest from the obstacles• Idea: Construct a path between qstart and qgoal by following edges on the Voronoi diagram• (Use the Voronoi diagram as a roadmap graph instead

of the visibility graph)

11

Cell Decomposition

• Define cells in C-Space• Mark any cell of the grid that intersects Cobs as

blocked• Find path through remaining cells by using (for

example) A* (e.g., use Euclidean distance as heuristic)

• Approximate � Easy to compute but not complete• Exact � Hard to compute but complete

Potential Fields

• Stay away from obstacles: Imagine that the obstacles are made of a material that generate a repulsive field

• Move closer to the goal: Imagine that the goal location is a particle that generates an attractive field

• Key issue: Local minima– Deterministic exploration of local minima

– Stochastic exploration

• In very high dimension � Randomize traversal of neighbors

12

Sampling Techniques

• Tends to explore the space rapidly in all directions

• Does not require extensive pre-processing

• Single query/multiple query problems• Needs only collision detection test � No need to

represent/pre-compute the entire C-space

• For a large class of problems:– Prob(finding a path) � 1 exponentially with the number of samples

• But, cannot detect that a path does not exist

13

Games

14

A

B

3 12 8 2 14 5 24 6

3 =min(3,12,8)

2 2

3 = max(3,2,2)

MinimaxMinimax (s)

If s is terminal

Return U(s)

If next move is A

Return

Else

Return

( ))('

'MinimaxmaxsSuccss

s∈

( ))('

'MinimaxminsSuccss

s∈

15

Minimax Properties

• Complete: If finite game

• Optimal: If opponent plays optimally

• Complexity: Essentially DFS, so:– Time: O(Bm)

– Space: O(Bm)

– B = number of possible moves from any state (branching factor)

– m = depth of search (length of game)

• Pruning (αβ):– Guaranteed to find same solution

– O(Bm/2) with proper ordering of the nodes � At “A” node, the successor are in order from high to low score

Non-Deterministic MinimaxMinimax (s)

If s is terminal

Return U(s)

If next move is A: Return

If next move is B Return

If chance node Return

( ))('

'MinimaxmaxsSuccss

s∈

( ))('

'MinimaxminsSuccss

s∈

( ) ( )∑∈ )('

'Minimax'sSuccss

ssp

16

-1

+2

+1

+1

Min value across each row

Max value = game value = +2

),( jiMinMaxjColumnsiRowsM

+1+5+1+5IV

+1+5+1+5III

+2+2+4+4II

+2+2-1-1I

IVIIIIII

+5 +4 +5 +2

Ma

x v

alu

e a

cro

ss e

ach c

olu

mn

Min value = game value = +2

),( jiMaxMiniRowsjColumns

M

+1+5+1+5IV

+1+5+1+5III

+2+2+4+4II

+2+2-1-1I

IVIIIIII

Note that we find the same value and same

strategies in both cases. Is that always the case?

Minimax vs. Maximin

• Fundamental Theorem I (von Neumann):

– For a two-player, zero-sum game with

perfect information:

• There always exists an optimal pure strategy for each player

• Minimax = Maximin

• Note: This is a game-theoretic

formalization of the minimax search

algorithm that we studied earlier.

17

Minimax with Mixed Strategies• Theorem II (von Neumann):

– For a two-player, zero-sum game with hidden information:

• There always exists an optimal mixed strategy

• In addition, just like for games with perfect information, it does not matter in which order we look at the players, minimax is the same as maximin

• Note: This is a direct generalization of the minimax result to mixed strategies.

))1(,)1(max(min

))1(,)1(min(max

22211211

22122111

mqmqmqmq

mpmpmpmp

q

p

×−+××−+×

=×−+××−+×

p = 0.5 p = 0.5

hold

hold

see

resi

gn

resign

see

resi

gn

-10+4 -20 +4 +16

18

p = 0.5 p = 0.5

hold

hold

see

resi

gn

resign

see

resi

gn

-10+4 -20 +4 +16

Hidden information: Player B cannot know which of these 2 states it’s in

The game is

non-

deterministic

because of

the initial random

choice of

cards

• Generate the matrix form of the game (be careful: It’s not a deterministic game)

• Find the optimal mixed strategy

• Find the expected payoff for Player A

Hold

Resign

SeeResign

Player B

Pla

ye

r A

19

4,52,33,13,8

6,36,28,42,3

9,09,75,85,3

5,65,94,13,0I

II

III

IV

Strategies

Pure Strategy Nash Equilibrium

• Does not always exist

• Is not always unique

• For continuous games:

( ) ( )**

1

**

1

*

1

**

1

*

1

*

1 ,,,,,,,,,,,,niiiiniiii

sssssusssssu LLLL +−+− ≤

( )**

1

*

1

*

1

* ,,,,,,maxargniiii

s

isssssus

i

LL +−=

( ) 0,, **

1 =∂

∂n

i

i sss

uL

20

Mixed Strategy Nash Equilibrium

• Player A plays strategy si with probability pi

• For non-zero games � Mixed Nash

equilibrium always exists

• Is not always unique

( ) ( )*** ,, qpuqpu AA ≤

( )** ,maxarg qpup Ap

=

+1,+20,0Movie

0,0+2,+1Hockey

MovieHockey

+1,00,+1Movie

0,+2+2,0Hockey

MovieHockey

tB=meet

tB=avoid

P(tA=meet | tB=meet) = 1

P(tA=meet | tB=avoid) = 1

P(tA=avoid | tB=meet) = 0

P(tA=avoid | tB=avoid) = 0

P(tB=meet | tA=meet) = 1/2

P(tB=meet | tA=avoid) = 1/2

P(tB=avoid | tA=meet) = 1/2

P(tB=avoid | tA=avoid) = 1/2

21

( ) )|P()(),(

BPlayer of typespossible all

AB

t

BBAAAA tttstsuu

B

∑=

Payoff if Player A knows that Player B is of type tB

Probability that Player B is indeed of type tB

Since Player A does not know Player B’s type, it has to sum over all possible

types to get the expected value

22

Documents

Review - Carnegie Mellon School of Computer Science1. Select K random pairs of parents ( X,X’) 2. For each pair of parents ( X,X’): 1.1 Generate offsprings (Y1,Y2) using crossover