Course written by Richard E. Korf, UCLA. The slides were made by students of this course
from Bar-Ilan University, Tel-Aviv, Israel.
Problems
There are 3 general categories of problems in AI:
Single-agent pathfinding problems. Two-player games. Constraint satisfaction problems.
Single-Agent Pathfinding Problems
In these problems, we have a single problem-solver making the decisions, and the task is to find a sequence of primitive steps that takes us from the initial location to the goal location.
Famous Example Domains
15-puzzle: about 10^13 states. First solved by [Korf 85] with IDA* and Manhattan distance.
24-puzzle: about 10^24 states. First solved by [Korf 96].
Rubik's cube: about 10^19 states. First solved by [Korf 97].
Traveling Salesman Problem.
[tile diagrams of the 15-puzzle and 24-puzzle omitted]
Two-Player Games
In a two-player game, one must consider the moves of an opponent, and the ultimate goal is a strategy that will guarantee a win whenever possible.
Two-player perfect-information games have received the most attention from researchers until now.
Nowadays, however, researchers are starting to consider more complex games, many of which involve an element of chance.
The best Chess, Checkers, and Othello players in the world are computer programs!
Constraint-Satisfaction Problems
In these problems, we also have a single agent making all the decisions, but here we are not concerned with the sequence of steps required to reach the solution, only with the solution itself.
The task is to identify a state of the problem such that all the constraints of the problem are satisfied.
Famous examples: Eight-Queens problem. Number Partitioning.
Problem Spaces
A problem space consists of a set of states of a problem and a set of operators that change the state.
State: a symbolic structure that represents a single configuration of the problem in sufficient detail to allow problem solving to proceed.
Operator: a function that takes a state and maps it to another state.
Not all operators are applicable to all states. The conditions that must be true in order for an operator to be legally applied to a state are known as the preconditions of the operator.
Examples:
8-Puzzle:
states: the different permutations of the tiles.
operators: moving the blank tile up, down, right or left.
Chess:
states: the different locations of the pieces on the board.
operators: legal moves according to chess rules.
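As a concrete sketch, the 8-puzzle states and operators above can be encoded directly; the tuple encoding and the `successors` name here are our own illustrative choices, not from the slides:

```python
# A sketch of the 8-puzzle problem space: states are tuples of 9 entries,
# with 0 denoting the blank (this encoding is our illustrative choice).

def successors(state):
    """Apply every applicable operator: slide the blank up, down, left, right."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    children = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:      # the operator's precondition
            other = r * 3 + c
            s = list(state)
            s[blank], s[other] = s[other], s[blank]
            children.append(tuple(s))
    return children

start = (1, 2, 3, 8, 0, 4, 7, 6, 5)        # blank in the center
print(len(successors(start)))              # 4 applicable operators
```

The precondition check (the blank's neighbor must be on the board) is exactly what makes an operator applicable or not in a given state.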
Problem Spaces
A problem instance: consists of a problem space, an initial state, and a set of goal states.
There may be a single goal state, or a set of goal states, any one of which would satisfy the goal criteria. In addition, the goal could be stated explicitly or implicitly, by giving a rule for determining when the goal has been reached.
All 4 combinations are possible: [single\set of goal state(s)] x [explicit\implicit].
For Constraint Satisfaction Problems, the goal will always be represented implicitly, since an explicit description is the solution itself.
Example: 4-Queens has 2 different goal states. Here the goal is stated explicitly.
[board diagram of a 4-Queens goal state]
Problem Representation
For some problems, the choice of a problem space is not so obvious.
The choice of representation for a problem can have an enormous impact on the efficiency of solving it.
There are no algorithms for problem representation. One general rule is that a smaller representation, in the sense of fewer states to search, is often better than a larger one.
For example, in the 8-Queens problem, when every state is an assignment of the 8 queens on the board, the number of possibilities is 64 choose 8, which is over 4 billion.
A solution can never have more than one queen per row, so we may assign each queen to a separate row; now we have 8^8, more than 16 million, possibilities.
Prohibiting two queens in the same column as well reduces the space to 8!, which is only 40,320 possibilities.
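The 8! representation lends itself to a short exhaustive check; a sketch (the names are ours) that also confirms the 40,320-state count:

```python
from itertools import permutations

# 8-Queens in the 8! representation: entry i of the permutation is the
# column of the queen in row i, so rows and columns are distinct by
# construction and only diagonal attacks remain to be checked.

def is_solution(perm):
    return all(abs(perm[i] - perm[j]) != j - i
               for i in range(len(perm))
               for j in range(i + 1, len(perm)))

states = list(permutations(range(8)))
solutions = [p for p in states if is_solution(p)]
print(len(states), len(solutions))   # 40320 states, 92 of them solutions
```

Searching 40,320 states is trivial; searching the 4-billion-state board representation for the same 92 solutions is not.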
Problem-Space Graphs
A problem-space graph is a mathematical abstraction often used to represent a problem space:
The states are represented by nodes of the graph.
The operators are represented by edges between nodes.
Edges may be undirected or directed.
Example: a small part of the 8-puzzle problem-space graph:
[diagram: 8-puzzle configurations connected by single blank-tile moves, omitted]
In most problem spaces there is more than one path between a pair of nodes.
Detecting when the same state has been regenerated via a different path requires saving all the previously generated states, and comparing newly generated states against the saved states.
Many search algorithms don’t detect when a state has previously been generated. The cost of this is that any state that can be reached by 2 different paths will be represented by duplicate nodes. The benefits are memory savings and simplicity.
Branching Factor and Solution Depth
The branching factor of a node: the number of children it has, not counting its parent if the operator is reversible. It is a function of the problem space.
The branching factor of a problem space: the average number of children of the nodes in the space.
The solution depth in a single-agent problem: the length of the shortest path from the initial node to a goal node. It is a function of the particular problem instance.
Eliminating Duplicate Nodes
In many cases we can reduce the size of the search tree by eliminating some simple duplicate paths.
In general, we never apply an operator and its inverse in succession, since no optimal path can contain such a sequence.
Therefore we never list the parent of a node as one of its children.
This reduces the branching factor of the problem by approximately 1.
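This pruning rule is easy to state in code. A minimal sketch on a grid space (the move names are illustrative, not from the slides):

```python
# Sketch of inverse-operator pruning: never undo the move just made.

INVERSE = {'up': 'down', 'down': 'up', 'left': 'right', 'right': 'left'}

def grid_moves(pos):
    """All four unit moves on an unbounded grid, as (name, new_pos) pairs."""
    x, y = pos
    return [('up', (x, y + 1)), ('down', (x, y - 1)),
            ('left', (x - 1, y)), ('right', (x + 1, y))]

def pruned_moves(pos, last_move):
    """Skip the inverse of the previous move: branching factor 4 -> 3."""
    return [(m, p) for m, p in grid_moves(pos)
            if last_move is None or m != INVERSE[last_move]]

print(len(pruned_moves((0, 0), None)))    # 4 at the root
print(len(pruned_moves((0, 0), 'up')))    # 3 thereafter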
Types of Problem Spaces
There are several types of problem spaces:
State space (OR graphs)
Problem reduction space (AND graphs)
Games (AND/OR graphs)
State Space
The states represent situations of the problem. The operators represent actions in the world.
Forward search: the root of the problem space represents the start state, and the search proceeds forward to a goal state.
Backward search: the root of the problem space represents the goal state, and the search proceeds backward to the initial state.
For example, in Rubik's Cube and the Sliding-Tile Puzzle, either a forward or a backward search is possible.
This can be considered an OR graph, because the solution picks one branch at each node.
Problem Reduction Space
In a problem reduction space, the nodes represent problems to be solved or goals to be achieved, and the edges represent the decomposition of a problem into subproblems.
This is best illustrated by the example of the Towers of Hanoi problem.
This is an AND graph.
[diagram: the AND tree for the 3-disk Towers of Hanoi]
3AC
  2AB: 1AC, 1AB, 1CB
  1AC
  2BC: 1BA, 1BC, 1AC
Problem Reduction Space
The root node, labeled "3AC", represents the original problem of transferring all 3 disks from peg A to peg C.
The goal can be decomposed into three subgoals: 2AB, 1AC, 2BC. In order to achieve the goal, all 3 subgoals must be achieved.
Problem Reduction Space
[sequence of slides: the AND tree below 3AC is built up step by step, first 2AB with its subgoals 1AC, 1AB, 1CB, then 1AC, then 2BC with its subgoals 1BA, 1BC, 1AC, each step accompanied by a diagram of pegs A, B, C]
An AND graph consists entirely of AND nodes; in order to solve a problem represented by one, you need to solve the problems represented by all of its children (Towers of Hanoi example).
An OR graph consists entirely of OR nodes; in order to solve the problem represented by one, you only need to solve the problem represented by one of its children (Eight-Puzzle example).
AND/OR Graphs
An AND/OR graph consists of both AND nodes and OR nodes.
One source of AND/OR graphs is a problem where the effect of an action cannot be predicted in advance, as in an interaction with the physical world.
Example: the counterfeit-coin problem.
Two-Player Game Trees
The most common source of AND/OR graphs is 2-player perfect-information games.
Example: Game Tree for 5-Stone Nim:
[diagram: game tree with stone counts, 5 at the root decreasing by 1 or 2 at each level down to 0; levels alternate between OR nodes (our moves) and AND nodes (the opponent's moves)]
Solution Subgraph for AND/OR Trees
In general, a solution to an AND/OR graph is a subgraph with the following properties:
It contains the root node.
For every OR node included in the solution subgraph, one child is included.
For every AND node included in the solution subgraph, all the children are included.
Every terminal node in the solution subgraph is a solved node.
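These conditions suggest a simple recursive evaluator. A sketch for the 5-stone Nim tree above, under rules the slide leaves implicit (we assume each move removes 1 or 2 stones and the player who takes the last stone wins):

```python
from functools import lru_cache

# AND/OR evaluation of 5-stone Nim. OR nodes are our moves, AND nodes are
# the opponent's; the winning rule assumed here is not spelled out on the slide.

@lru_cache(maxsize=None)
def forced_win(stones, our_move):
    """True if the searching player has a forced win from this node."""
    if stones == 0:
        return not our_move          # whoever just moved took the last stone
    children = [stones - k for k in (1, 2) if stones >= k]
    if our_move:                     # OR node: one winning child suffices
        return any(forced_win(c, False) for c in children)
    return all(forced_win(c, True) for c in children)   # AND node

print(forced_win(5, True))           # under these assumed rules: True
```

A `True` result means a solution subgraph exists: at each OR node we keep one winning move, and at each AND node we keep all of the opponent's replies.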
Solutions
The notion of a solution is different for the different problem types:
For a pathfinding problem, an optimal solution is a solution of lowest cost.
For a CSP, if there is a cost function associated with a state of the problem, an optimal solution would again be one of lowest cost.
For a 2-player game:
If the solution is simply a move to be made, an optimal solution would be the best possible move that can be made in a given situation.
If the solution is considered a complete strategy subgraph, then an optimal solution might be one that forces a win in the fewest number of moves in the worst case.
Combinatorial Explosion
The number of different states of the problems above is enormous, and grows extremely fast as the problem size increases.
Examples for the number of different possibilities:

Game                  # of nodes
8-Puzzle              9!
15-Puzzle             16!
24-Puzzle             25!
Rubik's Cube 2x2x2    3,265,920
Rubik's Cube 3x3x3    4.32x10^19
N-city TSP            n!
Checkers              10^20
Chess                 10^40
N-Queens              n!
The combinatorial explosion of the number of possible states as a function of problem size is a key characteristic that separates artificial intelligence search algorithms from those in other areas of computer science.
Techniques that rely on storing all possibilities in memory, or even generating all possibilities, are out of the question except for the smallest of these problems. As a result, the problem-space graphs of AI problems are usually represented implicitly by specifying an initial state and a set of operators to generate new states.
Search Algorithms
This course will focus on systematic search algorithms that are applicable to the different problem types, so a central concern is their efficiency.
There are 4 primary measures of the efficiency of a search algorithm:
The completeness of the algorithm (whether it is guaranteed to return a solution if one exists).
The quality of the solution returned: optimal, near-optimal (e-optimal), or suboptimal.
The running time of the algorithm.
The amount of memory required by the algorithm.
The Next Chapters
Chapter 2: brute-force searches.
Chapter 3: heuristic search algorithms.
Chapter 4: search algorithms that run in linear space.
Chapter 5: search algorithms for the case where individual moves of a solution must be executed in the real world before a complete optimal solution can be computed.
Chapter 6: methods for deriving the heuristic function.
Chapter 7: 2-player perfect-information games.
Chapter 8: analysis of alpha-beta minimax.
Chapter 9: games with more than 2 players.
Chapter 10: the decision quality of minimax.
Chapter 11: automatic learning of heuristic functions for 2-player games.
Chapter 12: constraint-satisfaction problems.
Chapter 13: parallel search algorithms.
Brute-Force Search
The most general search algorithms are brute-force searches, which do not use any domain-specific knowledge.
A brute-force search requires: a state description, a set of legal operators, an initial state, and a description of the goal state.
We will assume that all edges have unit cost.
To generate a node means to create the data structure corresponding to that node.
To expand a node means to generate all the children of that node.
Breadth-First Search (BFS)
BFS expands nodes in order of their depth from the root, generating one level of the tree at a time.
It is implemented with a first-in first-out (FIFO) queue.
At each cycle the node at the head of the queue is removed and expanded, and its children are placed at the end of the queue.
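The cycle described above is only a few lines of code. A sketch (the helper names and the tiny line-graph demo are ours) that also stores a parent pointer with each node so the solution path can be reported:

```python
from collections import deque

# BFS with a FIFO queue: remove the head, expand it, append its children.

def bfs(start, goal, successors):
    """Return a shortest path from start to goal, or None."""
    parent = {start: None}                 # pointer back to each node's parent
    queue = deque([start])
    while queue:
        node = queue.popleft()             # remove the head of the queue
        if node == goal:
            path = []
            while node is not None:        # follow parent pointers back
                path.append(node)
                node = parent[node]
            return path[::-1]
        for child in successors(node):
            if child not in parent:        # also serves as duplicate detection
                parent[child] = node
                queue.append(child)
    return None

# Tiny demo: shortest path on the line graph 0-1-2-...-9.
line = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 9]
print(bfs(0, 4, line))                     # [0, 1, 2, 3, 4]
```

The parent table doubles as the duplicate check discussed earlier; a pure brute-force tree search would omit it and accept duplicate nodes.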
Breadth-First Search (BFS)
[diagram: a tree whose nodes are numbered 0-14 in the order generated by BFS, level by level]
Solution Quality
BFS continues until a goal node is generated.
There are two ways to report the actual solution path:
Store with each node the sequence of moves made to reach that node.
Store with each node a pointer back to its parent (more memory-efficient).
If a goal exists in the tree, BFS will find a shortest path to it.
Time Complexity
We assume each node can be generated in constant time.
The running time is a function of the branching factor b and the solution depth d.
The number of nodes generated depends on where at level d the goal node is found; in the worst case we have to generate all the nodes at level d.
Let N(b,d) be the total number of nodes generated.
Time Complexity

N(b,d) = 1 + b + b^2 + b^3 + ... + b^d
b * N(b,d) = b + b^2 + b^3 + ... + b^(d+1)
b * N(b,d) - N(b,d) = b^(d+1) - 1
N(b,d) * (b - 1) = b^(d+1) - 1
N(b,d) = (b^(d+1) - 1) / (b - 1) ≈ b^d * b / (b - 1)

The time complexity of BFS is O(b^d).
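The closed form can be sanity-checked numerically against the direct sum (a quick check of the algebra, not part of the slides):

```python
# Numeric check of N(b,d) = (b^(d+1) - 1) / (b - 1) against the direct sum.

def n_direct(b, d):
    return sum(b ** i for i in range(d + 1))

def n_closed(b, d):
    return (b ** (d + 1) - 1) // (b - 1)   # exact: b-1 divides b^(d+1)-1

for b in (2, 3, 10):
    for d in (0, 1, 5):
        assert n_direct(b, d) == n_closed(b, d)

print(n_direct(2, 5))   # 1 + 2 + 4 + 8 + 16 + 32 = 63
```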
Space Complexity
To report the solution we need to store all the nodes generated.
Example:
Machine speed = 1 GHz; generating a new state takes 100 instructions, so 10^7 nodes/sec.
Node size = 4 bytes; total memory = 2 GB = 2*10^9 bytes, for a capacity of 2*10^9 / 4 = 500*10^6 nodes.
After 50 seconds the memory is exhausted!
Space complexity = time complexity = O(b^d).
Space Complexity
The previous example is based on current technology.
The problem won't go away: as memories increase in size, processors get faster and our appetite for solving larger problems grows.
BFS, and any algorithm that must store all the nodes it generates, is severely space-bound and will exhaust memory in minutes.
Depth-First Search (DFS)
DFS next generates a child of the deepest node that has not yet been completely expanded.
First implementation: with a last-in first-out (LIFO) stack, also known as depth-first expansion.
At each cycle the node at the top of the stack is removed and expanded, and its children are placed on top of the stack.
DFS - stack implementation
[diagram: the same tree, with nodes numbered in the order generated by stack-based DFS]
Depth-First Search (DFS)
Second implementation: recursive, also known as depth-first generation.
The recursive function takes a node as an argument and performs a DFS below that node: it loops through the node's children and makes a recursive call to perform a DFS below each child in turn.
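A sketch of the recursive implementation, with an artificial cutoff depth added since plain DFS may not terminate on an infinite tree (the helper names and line-graph demo are ours):

```python
# Recursive DFS (depth-first generation) with an artificial cutoff depth.

def dfs(node, goal, successors, cutoff):
    """Return a path to goal within `cutoff` moves, or None."""
    if node == goal:
        return [node]
    if cutoff == 0:
        return None
    for child in successors(node):       # loop through the node's children
        path = dfs(child, goal, successors, cutoff - 1)
        if path is not None:
            return [node] + path
    return None

line = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 9]
print(dfs(0, 3, line, 5))   # [0, 1, 0, 1, 2, 3]: not a shortest path!
```

Note that without duplicate detection DFS happily revisits nodes, and the first solution found is not along a shortest path, which previews the solution-quality discussion below.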
DFS - recursive implementation
[diagram: the same tree, with nodes numbered in the order generated by recursive DFS]
Space Complexity
The space complexity is linear in the maximum search depth, where d is the maximum depth of search and b is the branching factor:
Depth-first generation stores O(d) nodes.
Depth-first expansion stores O(bd) nodes (b times d, one level of children per depth on the stack).
DFS is time-limited rather than space-limited.
Time Complexity and Solution Quality
DFS generates the same set of nodes as BFS, so the time complexity of DFS is O(b^d).
However, on an infinite tree DFS may not terminate.
For example, the Eight Puzzle problem space contains only 181,440 states, but its search tree has infinitely long paths, and thus DFS may never end.
Time Complexity and Solution Quality
The solution for an infinite tree is to impose an artificial cutoff depth on the search.
If the chosen cutoff depth is less than d, the algorithm won't find a solution.
If the cutoff depth is greater than d, the time complexity is larger than that of BFS.
The first solution DFS finds may not be the optimal one.
Depth-First Iterative-Deepening (DFID)
DFID combines the best features of BFS and DFS.
DFID first performs a DFS to depth one, then starts over executing a DFS to depth two, and continues running DFS to successively greater depths until a solution is found.
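The combination is a short wrapper around a depth-limited DFS; a sketch (the depth-limited helper and the demo graph are ours):

```python
# DFID: run a depth-limited DFS to depth 0, 1, 2, ... until a goal is found.

def dls(node, goal, successors, limit):
    """Depth-limited DFS returning a path to goal, or None."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        path = dls(child, goal, successors, limit - 1)
        if path is not None:
            return [node] + path
    return None

def dfid(start, goal, successors, max_depth=50):
    for depth in range(max_depth + 1):   # start over with a deeper cutoff
        path = dls(start, goal, successors, depth)
        if path is not None:
            return path
    return None

line = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 9]
print(dfid(0, 4, line))                  # [0, 1, 2, 3, 4], a shortest path
```

Unlike the plain DFS sketch earlier, the first solution returned here is along a shortest path, because no iteration can reach depth k before every shallower depth has been tried.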
The numbers represent the order generated by DFID
[diagram: the same tree, with each node labeled by the times it is generated across the successive DFID iterations]
Solution Quality
DFID never generates a node until all shallower nodes have already been generated.
Therefore the first solution found by DFID is guaranteed to be along a shortest path.
Space Complexity
Like DFS, at any given point DFID saves only a stack of nodes.
The space complexity is only O(d).
Time Complexity
DFID does not waste a great deal of time in the iterations prior to the one that finds a solution; this extra work is usually insignificant.
The ratio of the number of nodes generated by DFID to those generated by BFS on a tree is b/(b-1).
The total number of nodes generated by DFID is b^d * (b/(b-1))^2, which is still O(b^d).
Optimality of DFID
Theorem 2.1: DFID is asymptotically optimal in terms of time and space among all brute-force shortest-path algorithms on a tree with unit edge costs.
Steps of proof: verify that DFID is optimal in terms of solution quality, time complexity, and space complexity.
Optimality of DFID - Solution Quality
Since DFID generates all nodes at a given level before any nodes at the next deeper level, the first solution it finds is arrived at via an optimal path.
Optimality of DFID - Time Complexity
Assume the contrary: there is an algorithm A that runs on problem P, finds a shortest path to a goal, and runs in time less than b^d.
Since its running time is less than b^d and there are b^d nodes at depth d, there must be at least one node n at depth d that A does not generate when solving P.
New Problem P’. P’ identical to P except that n is the goal.
A examines the same nodes in both P and P’.
A doesn’t examine the node n.
A fail to solve P’ since n is the only goal node.
There is no Algorithm runs better than O(b^d ).
Since DFID takes O(b^d ) time, its time complexity is asymptotically optimal.
Optimality of DFID - Space Complexity
There is a well-known result from computer science that any algorithm that takes f(n) time must use at least log f(n) space.
We have already seen that any brute-force search must take at least b^d time, so any such algorithm must use at least log(b^d) space, which is O(d) space.
Since DFID uses O(d) space, it is asymptotically optimal in space.
Graph with Cycles
On a graph with cycles, BFS can be more efficient because it can detect all duplicate nodes, whereas DFS cannot.
The complexity of BFS grows only with the number of nodes at a given depth.
The complexity of DFS depends on the number of paths of a given length.
In a graph with a large number of very short cycles, BFS is preferable to DFS, if sufficient memory is available.
For example, in a square grid with radius r, there are O(r^2) nodes but O(4^r) paths.
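The node/path gap can be made concrete by brute-force enumeration for a small radius (the move encoding is ours, and the pruned count previews the FSM idea below):

```python
# Nodes vs. paths in a square grid of radius r: the gap that makes
# duplicate detection matter. Brute-force enumeration for small r.

def count_paths(pos, steps, last=None, prune_inverse=False):
    """Walks of a given length from pos, optionally skipping the
    immediate inverse of the previous move."""
    if steps == 0:
        return 1
    x, y = pos
    total = 0
    for move, nxt in (('U', (x, y + 1)), ('D', (x, y - 1)),
                      ('L', (x - 1, y)), ('R', (x + 1, y))):
        if prune_inverse and last and {move, last} in ({'U', 'D'}, {'L', 'R'}):
            continue
        total += count_paths(nxt, steps - 1, move, prune_inverse)
    return total

r = 6
nodes = sum(1 for x in range(-r, r + 1) for y in range(-r, r + 1)
            if abs(x) + abs(y) <= r)            # 2r^2 + 2r + 1 nodes
print(nodes)                                    # 85 nodes within radius 6
print(count_paths((0, 0), r))                   # 4^6 = 4096 paths
print(count_paths((0, 0), r, prune_inverse=True))   # 4 * 3^5 = 972 paths
```

Even at radius 6, a DFS explores thousands of paths through fewer than a hundred distinct nodes; the gap widens exponentially with r.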
Pruning Duplicate Nodes in DFS
Eliminate the parent of each node as one of its children.
This is easily done with a finite-state machine (FSM), and reduces the branching factor of the grid from 4 to 3.
[diagram: FSM over the moves up, down, left, right that forbids the immediate inverse of the previous move]
Pruning Duplicate Nodes in DFS
A more efficient FSM allows sequences of moves going up only or down only, and sequences of moves going left only or right only.
The time complexity of a DFS controlled by this FSM, like that of BFS, is O(r^2).
[diagram: the more efficient FSM over the moves up, down, left, right]
Node Generation Times
BFS, DFS, and DFID generate asymptotically the same number of nodes on a tree, yet DFS and DFID are more efficient than BFS.
The amount of time to generate a node is proportional to the size of the state representation.
If DFS is implemented as a recursive program, a move requires only constant time, instead of time linear in the number of tiles.
This advantage of DFS becomes increasingly significant the larger the state description.
Backward Chaining/Search
Here the root node represents the goal state, and we search backward until we reach the initial state.
Requirements:
The goal state must be represented explicitly.
We must be able to reason backwards about the operators.
Bidirectional Search
Main idea: simultaneously search forward from the initial state and backward from the goal state, until the two search frontiers meet at a common state.
[diagram: two search frontiers growing from S and G toward each other]
Solution Quality
Bidirectional search guarantees finding a shortest path from the initial state to the goal state, if one exists.
Assume that there is a solution of length d and that both searches are breadth-first.
When the forward search has proceeded to depth k, its frontier will contain all nodes at depth k from the initial state.
Solution Quality
When the backward search has proceeded to depth d-k, its frontier will contain all states at depth d-k from the goal state.
Some state s along an optimal solution path lies at depth k from the initial state and at depth d-k from the goal state.
This state s is in the frontier of both searches, so the algorithm will find the match and return the optimal solution.
Time Complexity
If the two search frontiers meet in the middle, each search proceeds to depth d/2 before they meet, so the total number of nodes generated is O(2*b^(d/2)) = O(b^(d/2)).
But this isn't the asymptotic time complexity, because we also have to compare every new node with the opposite search frontier.
Naively, comparing each node with the entire opposite search frontier costs O(b^(d/2)) per node.
Time Complexity
With the naive comparison, the time complexity of the whole algorithm becomes O(b^d).
A more efficient approach uses hash tables:
in the average case, the time to hash and compare a node is constant,
and the asymptotic time complexity is O(b^(d/2)).
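A sketch of bidirectional BFS with hash tables (Python dicts) holding both frontiers, so each membership test against the opposite frontier is constant time on average; the line-graph demo is ours:

```python
from collections import deque

# Bidirectional BFS: expand both frontiers level by level, and check each
# newly generated node against the opposite frontier's hash table.

def bibfs_distance(start, goal, successors):
    """Length of a shortest path from start to goal, or None."""
    if start == goal:
        return 0
    dist_f, dist_b = {start: 0}, {goal: 0}
    q_f, q_b = deque([start]), deque([goal])
    while q_f and q_b:
        # Expand one full level of the currently smaller frontier.
        if len(q_f) <= len(q_b):
            q, dist, other = q_f, dist_f, dist_b
        else:
            q, dist, other = q_b, dist_b, dist_f
        for _ in range(len(q)):
            node = q.popleft()
            for child in successors(node):
                if child in other:               # the frontiers meet
                    return dist[node] + 1 + other[child]
                if child not in dist:
                    dist[child] = dist[node] + 1
                    q.append(child)
    return None

line = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 9]
print(bibfs_distance(0, 8, line))                # 8
```

Each hash-table lookup replaces a scan of the opposite frontier, which is exactly the savings described above.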
Space Complexity
The simplest implementation of bidirectional search uses BFS for one direction; the search in the other direction can be a DFS such as DFID.
At least one of the frontiers must be stored in memory.
The space complexity of bidirectional search is dominated by the BFS and is O(b^(d/2)).
Bidirectional search is space-bound.
Bidirectional search is much more time-efficient than unidirectional search.
Perimeter Search
A special kind of bidirectional search:
A breadth-first search to depth d is performed backwards from the goal state; the resulting frontier is the perimeter P.
Then any search is performed from the initial state towards the perimeter nodes.
Once the perimeter is reached, the search can stop.
[diagram: a perimeter around G, approached by a search from S]