CS 8520: Artificial Intelligence Search Paula Matuszek Fall, 2005 Slides based on Hwee Tou Ng,...

CS 8520: Artificial Intelligence

Search

Paula Matuszek

Fall, 2005

Slides based on Hwee Tou Ng, aima.eecs.berkeley.edu/slides-ppt, which are in turn based on Russell, aima.eecs.berkeley.edu/slides-pdf.

Paula Matuszek, CSC 8520, Fall 2005. Based in part on aima.eecs.berkeley.edu/slides-ppt and www.cs.umbc.edu/671/fall03/slides/c5-6_inf_search.ppt 2

Problem-solving agents

Example: Romania• On holiday in Romania; currently in Arad.• Flight leaves tomorrow from Bucharest• Formulate goal:

– be in Bucharest

• Formulate problem:– states: various cities– actions: drive between cities

• Find solution:– sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest

Example: Romania

Problem types• Deterministic, fully observable single-state problem

– Agent knows exactly which state it will be in; solution is a sequence

• Non-observable sensorless problem (conformant problem)– Agent may have no idea where it is; solution is a sequence

• Nondeterministic and/or partially observable contingency problem– percepts provide new information about current state– often interleave} search, execution

• Unknown state space exploration problem

Example: vacuum world• Single-state, start in #5.

Solution?

Example: vacuum world• Single-state, start in #5.

Solution? [Right, Suck]

• Sensorless, start in {1,2,3,4,5,6,7,8} e.g., Right goes to {2,4,6,8} Solution?

Example: vacuum world• Sensorless, start in

{1,2,3,4,5,6,7,8} e.g., Right goes to {2,4,6,8} Solution? [Right,Suck,Left,Suck]

• Contingency – Nondeterministic: Suck may

dirty a clean carpet

– Partially observable: location, dirt at current location.

– Percept: [L, Clean], i.e., start in #5 or #7

Solution?

Example: vacuum world• Sensorless, start in

{1,2,3,4,5,6,7,8} e.g., Right goes to {2,4,6,8} Solution? [Right,Suck,Left,Suck]

• Contingency – Nondeterministic: Suck may

dirty a clean carpet

– Partially observable: location, dirt at current location.

– Percept: [L, Clean], i.e., start in #5 or #7

Solution? [Right, if dirt then Suck]

Single-state problem formulationA problem is defined by four items:

1. initial state e.g., "at Arad"2. actions or successor function S(x) = set of action–state

pairs • e.g., S(Arad) = {<Arad Zerind, Zerind>, … }

3. goal test, can be• explicit, e.g., x = "at Bucharest"• implicit, e.g., Checkmate(x)

4. path cost (additive)• e.g., sum of distances, number of actions executed, etc.• c(x,a,y) is the step cost, assumed to be ≥ 0

• A solution is a sequence of actions leading from the initial state to a goal state

Selecting a state space• Real world is absurdly complex

state space must be abstracted for problem solving

• (Abstract) state = set of real states• (Abstract) action = complex combination of real actions

– e.g., "Arad Zerind" represents a complex set of possible routes, detours, rest stops, etc.

• For guaranteed realizability, any real state "in Arad“ must get to some real state "in Zerind"

• (Abstract) solution = – set of real paths that are solutions in the real world

• Each abstract action should be "easier" than the original problem

Vacuum world state space graph

• States? Actions? Goal Test? Path Cost?

Vacuum world state space graph

• states? integer dirt and robot location • actions? Left, Right, Suck• goal test? no dirt at all locations• path cost? 1 per action

Example: The 8-puzzle

• states?• actions?• goal test?• path cost?

Example: The 8-puzzle

• states? locations of tiles • actions? move blank left, right, up, down • goal test? = goal state (given)• path cost? 1 per move

[Note: optimal solution of n-Puzzle family is NP-hard]

Example: robotic assembly

• states? • actions?• goal test?• path cost?

Example: robotic assembly

• states?: real-valued coordinates of robot joint angles parts of the object to be assembled

• actions?: continuous motions of robot joints• goal test?: complete assembly• path cost?: time to execute

Tree search algorithms• Basic idea:

– offline, simulated exploration of state space by generating successors of already-explored states (a.k.a.~expanding states)

Tree search example

Implementation: general tree search

Implementation: states vs. nodes• A state is a (representation of) a physical configuration• A node is a data structure constituting part of a search tree

includes state, parent node, action, path cost g(x), depth

• The Expand function creates new nodes, filling in the various fields and using the SuccessorFn of the problem to create the corresponding states.

Search strategies• A search strategy is defined by picking the order of

node expansion. (e.g., breadth-first, depth-first) • Strategies are evaluated along the following

dimensions:– completeness: does it always find a solution if one exists?– time complexity: number of nodes generated– space complexity: maximum number of nodes in memory– optimality: does it always find a least-cost solution?

• Time and space complexity are measured in terms of – b: maximum branching factor of the search tree– d: depth of the least-cost solution– m: maximum depth of the state space (may be infinite)

Uninformed search strategies• Uninformed search strategies use only the

information available in the problem definition

• Breadth-first search

• Uniform-cost search

• Depth-first search

• Depth-limited search

• Iterative deepening search

Implementation: general tree search

Breadth-first search• Expand shallowest unexpanded node

• Implementation:– fringe is a FIFO queue, i.e., new successors go

at end

• Implementation:– fringe is a FIFO queue, i.e., new successors

go at end

at end

Properties of breadth-first search• Complete? Yes (if b is finite)• Time? 1+b+b2+b3+… +bd + b(bd-1) = O(bd+1)• Space? O(bd+1) (keeps every node in memory)• Optimal? Yes (if cost = 1 per step)

• Space is the bigger problem (more than time)

Uniform-cost search• Expand least-cost unexpanded node• Implementation:

– fringe = queue ordered by path cost

• Equivalent to breadth-first if step costs all equal• Complete? Yes, if step cost >= epsilon (otherwise can

loop)• Time and space? O(bceiling(C*/ epsilon)) where C* is the cost of

the optimal solution and epsilon is the smallest step cost– Can be much worse than breadth-first if many small steps not on

optimal path

• Optimal? Yes – nodes expanded in increasing order of g(n)

Uniform Cost Search

Depth-first search• Expand deepest unexpanded node• Implementation:

– fringe = LIFO queue, i.e., put successors at front

Properties of depth-first search

• Complete? No: fails in infinite-depth spaces, spaces with loops– Modify to avoid repeated states along path

complete in finite spaces

• Time? O(bm): terrible if m is much larger than d– but if solutions are dense, may be much faster than

breadth-first

• Space? O(bm), i.e., linear space!• Optimal? No

Depth-limited search• = depth-first search with depth limit l

• Nodes at depth l have no successors

• Solves problem of infinite depth

• Incomplete

• Recursive implementation:

Iterative deepening search•Repeated Depth-Limited search, incrementing limit l until a solution is found or failure.

•Repeats earlier steps at each new level, so inefficient -- but never more than doubles cost

•No longer incomplete

Iterative deepening search l =0

Properties of iterative deepening

• Complete? Yes

• Time? (d+1)b0 + d b1 + (d-1)b2 + … + bd = O(bd)

• Space? O(bd)

• Optimal? Yes, if step cost = 1

Summary of Algorithms for Ininformed Search

Criterion Breadth-first

Uniform Cost

Depth First

Depth-Limited

Iterative Deepening

Complete?

Yes Yes No No Yes

Time? O(bd+1) O(b(ceilingC*

/epsilon)

O(bm) O(bl) O(bd)

Space? O(bd+1) O(b(ceilingC*

/epsilon)

O(bm) O(bl) O(bd)

Optimal? Yes Yes No No Yes

A Caution: Repeated States• Failure to detect repeated states can turn a linear

problem into an exponential one, or even an infinite one.– For example: 8-puzzle– Simple repeat -- empty square simply moves back and

forth– More complex repeats also possible.

• Save list of expanded states -- the closed list.• Add new state to fringe only if it's not in closed list.

Graph search

Summary: Uninformed Search• Problem formulation usually requires abstracting away

real-world details to define a state space that can feasibly be explored

• Variety of uninformed search strategies

• Iterative deepening search uses only linear space and not much more time than other uninformed algorithms: usual choice

Informed search algorithms

Slides derived in part from

www.cs.berkeley.edu/~russell/slides/chapter04a.pdf, converted to powerpoint by Min-Yen Kan, National University of Singapore,

and from www.cs.umbc.edu/671/fall03/slides/c5-6_inf_search.ppt, Marie DesJardins, University of Maryland Baltimore County.

Review: Tree search

• A search strategy is defined by picking the order of node expansion

Heuristic Search• Uninformed search is generic; choice of node to

expand is dependent on shape of tree and strategy for node expansion.

• Sometimes domain knowledge can help us make a better decision.

• For the Romania problem, eyeballing it results in looking at certain cities first because they "look closer" to where we are going.

• If that domain knowledge can be captured in a heuristic, search performance can be improved by using that heuristic.

• This gives us an informed search strategy.

So What's A Heuristic?Webster's Revised Unabridged Dictionary (1913) (web1913)

Heuristic \Heu*ris"tic\, a. [Gr. ? to discover.] Serving to discover or find out.

The Free On-line Dictionary of Computing (15Feb98) heuristic 1. <programming> A rule of thumb, simplification or

educated guess that reduces or limits the search for solutions in domains that are difficult and poorly understood. Unlike algorithms, heuristics do not guarantee feasible solutions and are often used with no theoretical guarantee. 2. <algorithm> approximation algorithm.

From WordNet (r) 1.6 heuristic adj 1: (computer science) relating to or using a heuristic

rule 2: of or relating to a general formulation that serves to guide investigation [ant: algorithmic] n : a commonsense rule (or set of rules) intended to increase the probability of solving some problem [syn: heuristic rule, heuristic program]

Heuristics• For search it has a very specific meaning:

– All domain knowledge used in the search is encoded in the heuristic function h.

• Examples:– Missionaries and Cannibals: Number of people on starting river bank– 8-puzzle: Number of tiles out of place – 8-puzzle: Sum of distances from goal – Romania: straight-line distance from city to Bucharest

• In general:– h(n) >= 0 for all nodes n – h(n) = 0 implies that n is a goal node – h(n) = infinity implies that n is a deadend from which a goal cannot be

reached

• h is some estimate of how desirable a move is, or how close it gets us to our goal

Best-first search• Order nodes on the nodes list by

increasing value of an evaluation function, f(n), that incorporates domain-specific information in some way.

• This is a generic way of referring to the class of informed methods.

Best-first search• Idea: use an evaluation function f(n) for each node

– estimate of "desirability"Expand most desirable unexpanded node

• Implementation:Order the nodes in fringe in decreasing order of desirability

• Special cases:– greedy best-first search– A* search

Romania with step costs in km

Greedy best-first search• Evaluation function f(n) = h(n) (heuristic)

• = estimate of cost from n to goal

• e.g., hSLD(n) = straight-line distance from n to Bucharest

• Greedy best-first search expands the node that appears to be closest to goal

Greedy best-first search example

Properties of greedy best-first search

• Complete? No – can get stuck in loops, e.g., Iasi Neamt Iasi Neamt

• Time? O(bm), but a good heuristic can give dramatic improvement

• Space? O(bm) -- keeps all nodes in memory

• Optimal? No• Remember: Time and space complexity are measured in

terms of – b: maximum branching factor of the search tree– d: depth of the least-cost solution– m: maximum depth of the state space (may be infinite)

A* search• Idea: avoid expanding paths that are already

expensive

• Evaluation function f(n) = g(n) + h(n)

• g(n) = cost so far to reach n

• h(n) = estimated cost from n to goal

• f(n) = estimated total cost of path through n to goal

A* search example

Admissible heuristics• A heuristic h(n) is admissible if for every node n,

h(n) <= h*(n), where h*(n) is the true cost to reach the goal state from n.

• An admissible heuristic never overestimates the cost to reach the goal, i.e., it is optimistic.

• This means that we won't ignore a better path because we think the cost is too high. (If we underestimate it we wil learn that when we explore it.)

• Example: hSLD(n) (never overestimates the actual road distance)

Admissible heuristicsE.g., for the 8-puzzle:• h1(n) = number of misplaced tiles• h2(n) = total Manhattan distance

(i.e., no. of squares from desired location of each tile)

• h1(S) = ? • h2(S) = ?

Admissible heuristicsE.g., for the 8-puzzle:• h1(n) = number of misplaced tiles• h2(n) = total Manhattan distance(i.e., no. of squares from desired location of each tile)

• h1(S) = ? 8• h2(S) = ? 3+1+2+2+2+3+3+2 = 18

Properties of A*• If h(n) is admissible

• Complete? Yes (unless there are infinitely many nodes with f ≤ f(G) )

• Time? Exponential in [relative error in h * length of solution]

• Space? Keeps all nodes in memory

• Optimal? Yes; cannot expand f i+1 until f i is finished.

Some observations on A*• Perfect heuristic: If h(n) = h*(n) for all n, then only nodes on

the optimal solution path will be expanded. So, no extra work will be performed.

• Null heuristic: If h(n) = 0 for all n, then this is an admissible heuristic and A* acts like Uniform-Cost Search.

• Better heuristic: If h1(n) < h2(n) <= h*(n) for all non-goal nodes, then h2 is a better heuristic than h1 – If A1* uses h1, and A2* uses h2, then every node expanded by A2* is

also expanded by A1*. – In other words, A1 expands at least as many nodes as A2*. – We say that A2* is better informed than A1*, or A2* dominates A1*

• The closer h is to h*, the fewer extra nodes that will be expanded

What’s a good heuristic? How do we find one?

• If h1(n) < h2(n) <= h*(n) for all n, then both are admissible and h2 is better than (dominates) h1.

• Relaxing the problem: remove constraints to create a (much) easier problem; use the solution cost for this problem as the heuristic function

• Combining heuristics: take the max of several admissible heuristics: still have an admissible heuristic, and it’s better!

• Identify good features, then use a learning algorithm to find a heuristic function: may lose admissibility

Relaxed problems• A problem with fewer restrictions on the actions is

called a relaxed problem• The cost of an optimal solution to a relaxed

problem is an admissible heuristic for the original problem

• If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the shortest solution

• If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the shortest solution

Some Examples of Heuristics?• 8-puzzle?

• Mapquest driving directions?

• Minesweeper?

• Crossword puzzle?

• Making a medical diagnosis?

• ??

Local search algorithms• In many optimization problems, the path to the

goal is irrelevant; the goal state itself is the solution

• State space = set of "complete" configurations• Find configuration satisfying constraints, e.g., n-

queens

• In such cases, we can use local search algorithms• Keep a single "current" state, try to improve it

Example: n-queens• Put n queens on an n x n board with no two

queens on the same row, column, or diagonal

Hill-climbing search• If there exists a successor s for the current state n such that

– h(s) < h(n)

– h(s) <= h(t) for all the successors t of n,

• then move from n to s. Otherwise, halt at n.

• Looks one step ahead to determine if any successor is better than the current state; if there is, move to the best successor.

• Similar to Greedy search in that it uses h, but does not allow backtracking or jumping to an alternative path since it doesn’t “remember” where it has been.

• Not complete since the search will terminate at "local minima," "plateaus," and "ridges."

Hill-climbing search• "Like climbing Everest in thick fog with

amnesia"

Hill climbing example

start goal

h = -3

h = -2

h = -1

h = 0h = -4

f(n) = -(number of tiles out of place)

Drawbacks of hill climbing• Problems:

– Local Maxima: peaks that aren’t the highest point in the space

– Plateaus: the space has a broad flat region that gives the search algorithm no direction (random walk)

– Ridges: flat like a plateau, but with dropoffs to the sides; steps to the North, East, South and West may go down, but a step to the NW may go up.

• Remedy: – Random restart.

• Some problem spaces are great for hill climbing and others are terrible.

Hill-climbing search• Problem: depending on initial state, can get

stuck in local maxima

Example of a local maximum

start goal

Simulated annealing• Simulated annealing (SA) exploits an analogy between the way

in which a metal cools and freezes into a minimum-energy crystalline structure (the annealing process) and the search for a minimum [or maximum] in a more general system.

• SA can avoid becoming trapped at local minima.

• SA uses a random search that accepts changes that increase objective function f, as well as some that decrease it.

• SA uses a control parameter T, which by analogy with the original application is known as the system “temperature.”

• T starts out high and gradually decreases toward 0.

Simulated annealing search• Idea: escape local maxima by allowing some "bad" moves but

gradually decrease their frequency

Properties of simulated annealing search• One can prove: If T decreases slowly enough, then

simulated annealing search will find a global optimum with probability approaching 1

• Widely used in VLSI layout, airline scheduling, etc

Local beam search• Keep track of k states rather than just one

• Start with k randomly generated states

• At each iteration, all the successors of all k states are generated

• If any one is a goal state, stop; else select the k best successors from the complete list and repeat.

Summary: Informed search• Best-first search is general search where the minimum-cost nodes

(according to some measure) are expanded first. • Greedy search uses minimal estimated cost h(n) to the goal state as

measure. This reduces search time, but is neither complete nor optimal. • A* search combines uniform-cost search and greedy search: f(n) =

g(n) + h(n). A* handles state repetitions and h(n) never overestimates. – A* is complete, optimal and optimally efficient, but its space

complexity is still bad. – The time complexity depends on the quality of the heuristic

function.

• Local Search techniques are useful when you don't care about path, only result. Examples include– Hill-climbing – Simulated annealing– Local Beam Search

CS 8520: Artificial Intelligence Search Paula Matuszek Fall, 2005 Slides based on Hwee Tou Ng,...

Documents

©2012 Paula Matuszek CSC 9010: Text Mining Applications: Document-Based Techniques Dr. Paula Matuszek Paula.Matuszek@villanova.edu Paula.Matuszek@gmail.com

Creating the Presentation Layer DECO 3001 Tutorial 2 – Style Presented by Ji Soo Yoon 26 February 2004 Slides adopted from matuszek/cit597-

CSC 8520 Spring 2010. Paula Matuszek CS 8520: Artificial Intelligence Natural Language Processing Introduction Paula Matuszek Spring, 2010

©2012 Paula Matuszek CSC 9010: Text Mining Applications Dr. Paula Matuszek Paula.Matuszek@villanova.edu Paula.Matuszek@gmail.com (610) 647-9789

CSC 8520 Fall, 2005. Paula Matuszek Slides taken in part from: fkurfess/Courses/CSC-481/W03/Slides/3-Knowledge-Representation.ppt

CS 8520: Artificial Intelligence Intelligent Agents Paula Matuszek Fall, 2008 Slides based on Hwee Tou Ng, aima.eecs.berkeley.edu/slides-ppt, which are

1 CSC 4510, Spring 2012. © Paula Matuszek 2012. CSC 4510 Support Vector Machines (SVMs)

1 CSC 8520 Spring 2013. Paula Matuszek CS 8520: Artificial Intelligence Machine Learning 1 Paula Matuszek Spring, 2013

Chapter 18, Sections 1{3 - Artificial Intelligence: A ...aima.eecs.berkeley.edu/slides-pdf/chapter18.pdf · Chapter 18, Sections 1{3 23. Choosing an attribute Idea: a good attribute

More Intro to CLIPS Paula Matuszek CSC 9010, Spring, 2011

Welcome to the Computer and Information Technology program matuszek

©2012 Paula Matuszek CSC 9010: Text Mining Applications Fall, 2012 Introduction to GATE Dr. Paula Matuszek Paula.Matuszek@gmail.com Taken partially from

Heapsort matuszek/cit594-2008/ Based off slides by: David Matuszek

1 CSC 9010 Spring, 2011. Paula Matuszek Slides modified from Natasha Noy, protege.stanford.edu/amia2003/AMIA2003Tutorial.ppt CS 9010: Knowledge-Based Systems

Steven Matuszek UMBC CMSC 691M Spring 2003 Extending Multi-Agent Coordination in

Bookkeeping Bayes Nets - Inspiring Innovation...1 © Cynthia Matuszek – UMBC CMSC 671 Bayes Nets AI Class 10 (Ch. 14.1–14.4.2; skim 14.3) Based on slides by Dr. Marie desJardin

CSC 9010 Spring, 2006. Paula Matuszek 1 CS 9010: Semantic Web Inference and Rules Paula Matuszek Spring, 2006

1 CSC 8520 Spring 2010. Paula Matuszek CS 8520: Artificial Intelligence Logical Agents and First Order Logic Paula Matuszek Spring 2010

CS 8520: Artificial Intelligence Robotics Paula Matuszek Fall, 2008

CS 8520: Artificial Intelligence Intelligent Agents and Search Paula Matuszek Fall, 2005 Slides based on Hwee Tou Ng, aima.eecs.berkeley.edu/slides-ppt,