Lecture 2: Introduction to AI
What is Artificial Intelligence?
History of AI
The State of the Art
What is AI?
Systems that think like humans
Systems that think rationally
Systems that act like humans
Systems that act rationally
Acting Humanly: The Turing Test
Test of intelligence: The Imitation Game
Anticipated all major arguments against AI
Broke down AI into knowledge, reasoning, language understanding, and learning
Predicted the success of AI: “[By 2000, programs will be able to] play the imitation game so well that an average interrogator will not have more than 70% chance of making the right identification after 5 minutes of questioning.”
Thinking Humanly: Cognitive Science
Requires understanding of the biological processes of human thought
What level of abstraction is appropriate?
Thinking Rationally: Laws of Thought
Aristotle: What are correct arguments and thought processes?
From mathematics, logic, and philosophy through to modern AI
But, is all intelligent behavior logical?
What is the purpose of thinking?
Acting Rationally: Do the Right Thing
The right thing: That which is expected to maximize goal achievement
But, the right thing doesn’t necessarily involve thinking (e.g. blinking).
(Modern) History of AI
1943: McCulloch & Pitts’ circuit model of the brain
1950: Turing’s “Computing Machinery and Intelligence”
1950s: Samuel’s checkers; Newell & Simon’s Logic Theorist
1965: Robinson’s logical reasoning algorithm
1960s: Complexity problems
1970s: Early knowledge-based systems
1980s: Expert systems
1990s: Agent model and scientific formalization
State of the Art: Can a Machine...
... plan and control the operation of a spacecraft?
... play a master-level game of chess?
... steer a car on a cross-country trip of the US?
... diagnose lymph-node pathology at an expert level?
... manage the logistics planning and scheduling of over 50,000 vehicles, cargo, and soldiers during wartime?
... assist surgeons during microsurgery?
... solve crossword puzzles better than humans?
... unload any dishwasher and put everything away?
Problem Solving via Uninformed Search
Traveling through Romania
Currently in Arad
Flight leaves tomorrow from Bucharest
Goal: be in Bucharest
Problem:
states = various cities
actions = drive between cities
Solve:
find the sequence of cities to drive through: Arad, Sibiu, Fagaras, Bucharest
Stating the Problem
Initial State: at Arad
Successor Function: S(state) = { (action,state,cost), ... }
Goal Test: Goal(state) = True if state is goal
Path Cost: Sum of step costs c(x,a,y) = g(n)
Solution: Sequence of actions from initial to goal state
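The formulation above can be sketched in Lisp. This is a minimal sketch, not the lecture's code: the names *roads*, successors, goal-p, and route-cost are assumptions made for illustration, and only the part of the map between Arad and Bucharest is included.

```lisp
;; Partial road map (distances in km) as an association list.
;; Illustrative fragment only; cities not listed have no successors here.
(defparameter *roads*
  '((arad      (sibiu . 140) (timisoara . 118) (zerind . 75))
    (sibiu     (arad . 140) (fagaras . 99) (oradea . 151) (rimnicu . 80))
    (fagaras   (sibiu . 99) (bucharest . 211))
    (rimnicu   (sibiu . 80) (pitesti . 97) (craiova . 146))
    (pitesti   (rimnicu . 97) (bucharest . 101) (craiova . 138))
    (bucharest (fagaras . 211) (pitesti . 101))))

(defun successors (state)
  "S(state): a list of (action next-state step-cost) triples."
  (mapcar (lambda (road)
            (list (list 'drive-to (car road)) (car road) (cdr road)))
          (cdr (assoc state *roads*))))

(defun goal-p (state)
  "Goal test: true when we are in Bucharest."
  (eq state 'bucharest))

(defun route-cost (cities)
  "Path cost g: the sum of step costs along a list of cities."
  (loop for (here next) on cities
        while next
        sum (cdr (assoc next (cdr (assoc here *roads*))))))

;; The solution named on the slide:
;; (route-cost '(arad sibiu fagaras bucharest))  ; => 450
```

Representing a solution as a list of cities keeps the example short; the slides define a solution as a sequence of actions, which here would be the corresponding drive-to actions.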
States vs Nodes
State: representation of a physical configuration
Node: data structure in the search graph, including
state
parent node
action
path cost -- g(x)
depth
Implementing States
(defun make-node (&key state parent action (path-cost 0) (depth 0))
  (list state parent action path-cost depth))

(defun node-state (node) (car node))
(defun node-parent (node) (cadr node))
(defun node-action (node) (caddr node))
(defun node-path-cost (node) (cadddr node))
(defun node-depth (node) (car (cddddr node)))
OR
(defstruct node state (parent nil) (action nil) (path-cost 0) (depth 0))
Search Strategies
Defined by picking the order of node expansion
Strategies evaluated by:
completeness: does it always find a solution if one exists?
time complexity: number of nodes generated
space complexity: maximum number of nodes in memory
optimality: does it always find a least cost solution?
Time and space complexity are measured in terms of:
b: maximum branching factor of the search tree (graph)
d: depth of least-cost solution
m: maximum depth of the state space (may be ∞)
Uninformed Search
Breadth-First
Uniform-Cost
Depth-First
Depth-Limited
Iterative Deepening
General Search
(pause here to show lisp code)
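The Lisp code shown in lecture is not reproduced in these notes. As a stand-in, here is a minimal sketch of general tree search in which the strategy is fixed entirely by the enqueueing function. The function names, the (action state cost) triple format, and the toy graph are assumptions made for illustration.

```lisp
(defstruct node state (parent nil) (action nil) (path-cost 0) (depth 0))

(defun expand (node successorf)
  "Build one child node per (action state cost) successor of NODE."
  (mapcar (lambda (triple)
            (destructuring-bind (action state cost) triple
              (make-node :state state :parent node :action action
                         :path-cost (+ (node-path-cost node) cost)
                         :depth (1+ (node-depth node)))))
          (funcall successorf (node-state node))))

(defun general-search (initial goalp successorf enqueuef)
  "Return a goal node, or NIL if the fringe becomes empty."
  (let ((fringe (list (make-node :state initial))))
    (loop
      (when (null fringe) (return nil))
      (let ((node (pop fringe)))
        (when (funcall goalp (node-state node)) (return node))
        (setf fringe
              (funcall enqueuef fringe (expand node successorf)))))))

(defun solution-path (node)
  "Follow parent links back to the root; return the states in order."
  (let ((path '()))
    (loop while node
          do (push (node-state node) path)
             (setf node (node-parent node)))
    path))

;; Breadth-first: new nodes go to the back of the fringe (FIFO).
(defun breadth-first-enqueue (fringe new) (append fringe new))
;; Depth-first: new nodes go to the front of the fringe (LIFO).
(defun depth-first-enqueue (fringe new) (append new fringe))
;; Uniform-cost: keep the fringe sorted by path cost g(n).
(defun uniform-cost-enqueue (fringe new)
  (sort (append fringe new) #'< :key #'node-path-cost))

;; Hypothetical toy graph: a->b (1), a->c (5), b->d (1), c->g (1), d->g (10).
(defun toy-successors (state)
  (case state
    (a '((go-b b 1) (go-c c 5)))
    (b '((go-d d 1)))
    (c '((go-g g 1)))
    (d '((go-g g 10)))
    (t '())))
```

On the toy graph, breadth-first and uniform-cost both return the path A, C, G with cost 6, while depth-first returns A, B, D, G with cost 12. This version applies the goal test when a node is removed from the fringe; it does no repeated-state checking, so it is tree search only.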
Breadth-First Search
Fringe is FIFO queue
Complete (if b is finite)
Time complexity = O(b^(d+1))
Space complexity = O(b^(d+1)) -- keeps all nodes in memory
Optimal if unit cost
[Figure: Searching the Tree (Graph). Breadth-first expansion of the search tree rooted at Arad. Arad expands to Sibiu, Timisoara, and Zerind; Sibiu to Arad, Fagaras, Oradea, and Rimnicu Vilcea; Timisoara to Arad and Lugoj; Zerind to Arad and Oradea.]
Uniform-Cost Search
Fringe = queue ordered by path cost, g(n)
Equivalent to breadth-first if step costs are all equal
Complete if every step cost is at least some small positive constant ε
Time & space complexity: similar to breadth-first
Optimal because nodes are expanded in increasing order of path cost, g(n)
Depth First Search
Fringe is LIFO queue (stack)
i.e. enqueuef = #'(lambda (x y) (append y x))
Incomplete -- fails in infinite-depth spaces or spaces with loops
modify code to avoid repeated states
then complete in finite spaces
Time complexity is O(b^m) -- terrible if m >> d, but often much faster than breadth-first
Space complexity is O(bm) -- linear!!!
Optimal -- No. Why?
[Figure: Searching the Tree (Graph). Depth-first expansion of the search tree rooted at Arad. Arad expands to Sibiu, Timisoara, and Zerind; Sibiu to Arad, Fagaras, Oradea, and Rimnicu Vilcea; Timisoara to Arad and Lugoj; Zerind to Arad and Oradea.]
Other Uninformed Searches
Depth-limited is depth-first with a cutoff (nodes at maximum depth have no successors)
Iterative deepening search is iterative calls to depth-limited search, increasing depth cutoff each time
What’s the overhead?
Is it complete? What are the time and space complexities? Is it optimal?
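A minimal sketch of depth-limited and iterative deepening search may help with these questions. It assumes a successor function that returns a plain list of states (actions and costs are left out for brevity); all names and the toy graph are illustrative.

```lisp
(defun depth-limited-search (state goalp successorf limit)
  "Return a list of states from STATE to a goal, or NIL if no goal is
found within LIMIT steps."
  (cond ((funcall goalp state) (list state))
        ((zerop limit) nil)          ; cutoff: behave as if no successors
        (t (dolist (next (funcall successorf state) nil)
             (let ((path (depth-limited-search next goalp successorf
                                               (1- limit))))
               (when path (return (cons state path))))))))

(defun iterative-deepening-search (state goalp successorf max-depth)
  "Call depth-limited search with cutoffs 0, 1, ..., MAX-DEPTH."
  (loop for limit from 0 to max-depth
        for path = (depth-limited-search state goalp successorf limit)
        when path return path))

;; Hypothetical graph with a loop between A and B; plain depth-first
;; tree search would bounce between A and B forever.
(defun loop-successors (state)
  (case state
    (a '(b c))
    (b '(a))
    (c '(g))
    (t '())))

;; (iterative-deepening-search 'a (lambda (s) (eq s 'g))
;;                             #'loop-successors 5)
;; => (A C G)
```

Each iteration repeats the work of the previous ones, which is the overhead the slide asks about; because the number of nodes grows with depth, the last iteration dominates the total.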
Eliminating Repeated States
Failure to detect repeated states can turn a linear problem into an exponential one
Fixing this turns tree search into graph search
Learning from Observations
Types of Learning
Supervised
Unsupervised
Reinforcement
Given (x, f(x)), what is f?
Inductive Learning
Given a collection of examples (x, f(x)), find h that approximates f
h is the hypothesis
a good hypothesis is one that generalizes well
a hypothesis space, H, is the set of hypotheses that are considered
Regression
Linear regression
Polynomial regression
Ockham’s razor
Realizable and unrealizable learning problems
a learning problem is realizable if H contains f
Tradeoff between expressiveness of H and the complexity of finding simple, consistent h
Classification
Learning a discrete-valued function is called classification
f(vertebrate) = (fish, amphibian, reptile, bird, mammal)
a decision tree takes as input an object described by attributes and returns a decision
Blood Temperature?
  warm: Has Feathers?
    yes: BIRD
    no: MAMMAL
  cold: Has Scales?
    yes: Has Fins?
      yes: FISH
      no: REPTILE
    no: AMPHIBIAN
Boolean Classification
Special case in which f(x) = yes/no
∀x f(x) ⇔ (P1(x) ∨ P2(x) ∨ ... ∨ Pn(x)) where
Pi(x) is the conjunction of all tests along path i from root to leaf, for all i such that leaf is “yes”
Since it is one variable and all predicates are unary, this is actually propositional
Decision trees are fully expressive for the class of propositional languages
Inducing Decision Trees from Examples
Training Set: a list of examples of attribute value pairs with their corresponding outcomes
Test Set: another set of examples -- never used for subsequent training
(defun learn-decision-tree (examples attributes default)
  (cond
    ((null examples) default)
    ((all-same-classification examples) (classification (car examples)))
    ((null attributes) (most-frequent-classification examples))
    (t (let ((best (choose-attribute attributes examples))
             (m (most-frequent-classification examples)))
         (make-decision-tree
          :root best
          :subtrees (mapcar (lambda (examples-of-each)
                              (learn-decision-tree examples-of-each
                                                   (remove best attributes)
                                                   m))
                            (divide-examples examples best)))))))
Information
if the possible answers vi have probabilities P(vi), then the information content of the actual answer is
I(P(v1),...,P(vn)) = ∑i -P(vi)log2P(vi)
I(0.5, 0.5) = -0.5(-1) - 0.5(-1) = 1
I(0.25, 0.25, 0.25, 0.25) = 4(-0.25)(-2) = 2
How much information is still needed after an attribute is selected?
Remainder(A) = ∑i I(pi/(pi+ni), ni/(pi+ni))(pi+ni)/(p+n)
Gain(A) = I(p/(p+n), n/(p+n)) - Remainder(A)
(choose-attribute attributes examples) chooses the attribute with the largest gain
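The information and gain formulas can be checked numerically. The following is a sketch only: the function names and the representation of an attribute as a list of (p_i n_i) counts, one pair per attribute value, are assumptions, and the example split is hypothetical.

```lisp
(defun information (&rest probabilities)
  "I(P(v1),...,P(vn)) in bits; a probability of 0 contributes 0."
  (- (reduce #'+ (mapcar (lambda (p)
                           (if (zerop p) 0 (* p (log p 2))))
                         probabilities))))

(defun remainder (splits)
  "SPLITS is a list of (p_i n_i) counts, one non-empty pair per value."
  (let ((total (reduce #'+ (mapcar (lambda (s) (+ (first s) (second s)))
                                   splits))))
    (reduce #'+
            (mapcar (lambda (s)
                      (let ((size (+ (first s) (second s))))
                        (* (/ size total)
                           (information (/ (first s) size)
                                        (/ (second s) size)))))
                    splits))))

(defun gain (splits)
  "Gain(A) = I(p/(p+n), n/(p+n)) - Remainder(A)."
  (let ((p (reduce #'+ (mapcar #'first splits)))
        (n (reduce #'+ (mapcar #'second splits))))
    (- (information (/ p (+ p n)) (/ n (+ p n)))
       (remainder splits))))

;; (information 0.5 0.5)            ; 1 bit, as on the slide
;; (information 0.25 0.25 0.25 0.25) ; 2 bits, as on the slide
;; Hypothetical attribute with three values on 6 positive and
;; 6 negative examples:
;; (gain '((0 2) (4 0) (2 4)))      ; about 0.54 bits
;; An attribute that leaves every subset half positive has gain 0:
;; (gain '((3 3) (3 3)))
```

An attribute whose values separate the examples into nearly pure subsets leaves little remaining information and so has high gain, which is why it is preferred as the next test.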
Rule-Based Expert Systems
Rule-Based Systems
Rule-based systems are structured as two “memories”
Knowledge base of rules
Working memory of facts
These can be logic based and use unification to determine rule satisfaction
Rules = definite Horn clauses (multiple FOL conditions, single FOL assertion)
Facts = Horn facts (FOL predicates with terms including variables)
Or “production systems” that use pattern matching instead of unification
Rules can have multiple assertions as well as multiple conditions
Facts have no variables
Structure of Production Rules
Production rules have a left-hand side (LHS) and a right-hand side (RHS) corresponding to a condition and action respectively.
LHS is a conjunction of patterns with at least one pattern non-negated.
RHS is a set of actions that modify working memory.
Sample Rules
(((agent (location =x) (holding =a)) (arrow (name =a)) (wumpus (location (& =y !x))) (room (name =x) (adjacent =y)))
((MODIFY 1 (holding nil)) (ASSERT shoot (arrow =a) (at =y))) )
(((shoot (arrow =x) (at =y)) (=object (location =y)))
((MODIFY 2 (status dead)) (REMOVE 1)) )
Sample Working Memories
((agent (location 1) (holding A)) (arrow (name A)) (wumpus (location 2)) (room (name 1) (adjacent 2)))
((agent (location 1) (holding nil)) (arrow (name A)) (wumpus (location 2)) (room (name 1) (adjacent 2)) (shoot (arrow A) (at 2)))
((agent (location 1) (holding nil)) (arrow (name A)) (wumpus (location 2) (status dead)) (room (name 1) (adjacent 2)))
Early Ground-Breaking Rule-Based Systems
R1: Digital Equipment Corporation’s expert system for configuring large computer systems – a sales force tool to accurately meet customer requirements. Forward Chaining.
ACE: AT&T Bell Laboratories’ expert system for diagnosing telephone line repairs based on trouble tickets. Forward Chaining.
MYCIN: Bacterial infection diagnosis including explanation system. Backward Chaining.
Dendral: Elucidation of the molecular structure of unknown organic compounds taken from known groups of such compounds. Backward Chaining.
Rule Inference Engines
Like Horn clause forward chaining (or backward chaining) algorithms, rule inference engines find rules that are satisfied and fire them, altering the working memory and thereby enabling other rules to be satisfied and fired in turn
Rule inference engines repeat three steps:
Match: determine which rules are satisfied by which facts in working memory producing the conflict set of instantiations
Conflict Resolution: select which instantiation is best
Act: Fire the instantiation, altering working memory
Rule inference halts either when no rule matches or when an explicit halt action is fired
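The match, conflict resolution, and act cycle can be sketched with a deliberately simplified engine. Everything here is an assumption made for illustration: facts are plain symbols with no variables, a rule is a list (name conditions additions removals), conflict resolution just takes the first matching rule, and each rule may fire at most once (a crude stand-in for not re-firing the same instantiation).

```lisp
(defun run-rules (rules memory)
  "Repeat match, conflict resolution, and act until no rule matches.
Returns the final working memory as a list of facts."
  (let ((fired '()))
    (loop
      ;; Match: rules not yet fired whose conditions are all in memory.
      (let ((conflict-set
              (remove-if-not (lambda (rule)
                               (and (not (member (first rule) fired))
                                    (subsetp (second rule) memory)))
                             rules)))
        ;; Halt when no rule matches.
        (when (null conflict-set) (return memory))
        ;; Conflict resolution: take the first instantiation.
        (let ((rule (first conflict-set)))
          (push (first rule) fired)
          ;; Act: remove facts, then add facts.
          (setf memory (union (third rule)
                              (set-difference memory (fourth rule)))))))))

;; Two hypothetical rules:
;;   make-d-e: if A, B, and C then make D and E
;;   make-f:   if B and D then make F and remove D
(defparameter *rules*
  '((make-d-e (a b c) (d e) ())
    (make-f   (b d)   (f)   (d))))

;; (run-rules *rules* '(a b c))
;; returns a working memory containing exactly A, B, C, E, and F
;; (the order of the facts is not specified).
```

A real production system matches patterns with variables against structured working memory elements and ranks the conflict set, for example by recency and specificity; this propositional version only shows the shape of the cycle.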
Conflict Resolution
Many alternative algorithms to select which instantiation should fire
Two factors often used to select the instantiation are recency and specificity
Recency: time tags on WMEs
Specificity: How many individual values matched
Conflict resolution algorithms are linear
Inefficient Match Algorithm
For each LHS, match each clause against each working memory element (WME), maintaining variable binding consistency among clauses within the same LHS
For each set of WMEs that successfully match the clauses on a LHS, add the instantiation, that is, the pair (Rule, WME-set), to the conflict set
After conflict resolution and act phases, repeat match with all LHSs and all WMEs including changes
Observation
Changes in WM result in changes in the conflict set
Therefore, only the changes in WM should be addressed in an efficient match algorithm
Rete Match Algorithm
Precompile rules into a dataflow network (rete) data structure
Nodes in the rete represent the match of one value
Nodes store partial instantiations
Leaves of the rete represent complete instantiations
Sample Rete
(((agent (location =x) (holding =a)) (arrow (name =a)) (wumpus (location (& =y !x))) (room (name =x) (adjacent =y)))
((MODIFY 1 (holding nil)) (ASSERT shoot (arrow =a) (at =y))) )
[Figure: rete for the rule above]
Pattern tests:
class: agent -- location: =x, holding: =a
class: arrow -- name: =a
class: wumpus -- location: =y !x
class: room -- name: =x, adjacent: =y
Working memory elements:
P: (agent (location 1) (holding A))
Q: (arrow (name A))
R: (wumpus (location 2))
S: (room (name 1) (adjacent 2))
Partial instantiations stored at nodes:
P: =x 1
P: =a A
Q: =a A
R: =y 2
S: =x 1
S: =y 2
P,Q: =a A
P,R: =x 1, =y 2
R,S: =y 2
P,R,S: =x 1, =y 2
Complete instantiation at the leaf:
P,Q,R,S: =a A, =x 1, =y 2
Improving Rule-Based System Performance
Performance and Parallelism
• many patterns and many facts often result in poor performance of rule-based systems, even with state-saving algorithms such as rete
• since rule bases are collections of unordered rules, and working memories are collections of unordered facts (time tags used for recency heuristic, not to impose a specific, unbreakable order), parallel processing is an obvious approach for performance improvement
Relevance and Match Complexities
• relevance complexity — complexity of determining intracondition satisfaction of a rule, i.e. the constants
• match complexity — complexity of determining intercondition satisfaction of a rule, i.e. the variables
Rule Parallelism
• distribute one rule per processor, keeping only relevant working memory for each rule
• during match, process all rules simultaneously
• perform conflict resolution via a parallel resolve algorithm — perhaps in O(log n) time
• act phase broadcasts changes to all processors, absorbed by relevant rules
Problems with Rule Parallelism
• inherent sequential nature — even if a given cycle is faster, each cycle must be done in sequence
• temporal redundancy — small changes to working memory per cycle means little work to be done in each cycle
• culprit rules — certain rules require substantially more match time than others, leading to load imbalance among processors; these rules are high in match complexity
Considering Node Parallelism
• better approach is to distribute rete nodes across parallel processors rather than rules
• each processor has its local memory to store the partial instantiations
• however, the three problems still exist — inherent sequential nature, temporal redundancy, and culprit nodes (rather than rules)
Addressing Inherent Sequential Nature and Temporal Redundancy
• combining rule chains — rules which often fire in succession are combined into macro rules with more complex LHSs and resulting in more changes to working memory on their RHSs
• multiple rule firing for non-conflicting rules
• specifying rule sets — flow of control is abstracted out of rules, explicitly identifying sequential requirements
Combining Rule Chains
• if A, B, and C then make D and E
• if B and D then make F and remove D
becomes
• if A, B, and C then make E and F
• perhaps more rules created, but such rules are parallelizable
Addressing Culprit Rules/Nodes
• creating constrained copies of culprit rules/nodes — culprit rules/nodes are copied many times, each copy constrained to be relevant to a distinct subset of possible working memory elements
• this process results in a shift to increasing relevance complexity and decreasing match complexity — i.e. fewer variable binding tests (joins) and more constant tests (selects)
• relevance complexity is easily parallelized whereas match complexity causes load imbalance
(p join-pieces
  (goal ^type try-join ^id1 nil ^id2 nil)
  (piece ^color <x> ^id <i>)
  (piece ^color <x> ^id { <j> <> <i> })
  -->
  (modify 1 ^id1 <i> ^id2 <j>))
Rete of Join-Pieces
[Figure: rete for the join-pieces rule. One branch tests class = goal, type = try-join, id1 = nil, id2 = nil; the other tests class = piece. Join nodes are labeled: join equal colors, join unequal ids, natural join.]
Analyzing the Culprit
• say n = 1000 puzzle pieces
• upon adding the goal of type try-join, n^2 = 1,000,000 tests occur at the join node
• suppose that node only looks at RED pieces and there are 20 colors, evenly distributed
(p join-pieces-RED
  (goal ^type try-join ^id1 nil ^id2 nil)
  (piece ^color RED ^id <i>)
  (piece ^color RED ^id { <j> <> <i> })
  -->
  (modify 1 ^id1 <i> ^id2 <j>))
Analyzing the Culprit
• each copy is relevant to n ≈ 50 puzzle pieces
• upon adding the goal of type try-join, each copy only performs n^2 = 2,500 tests and all are performed in parallel
• 400-fold speed improvement with 20 processors
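The arithmetic behind these figures can be written out as a small sketch. The function name is an assumption; the numbers are the ones on the slides (1000 pieces, 20 evenly distributed colors), and the join is assumed to test every pair of pieces it sees.

```lisp
(defun copy-constrain-analysis (pieces colors)
  "Return a list of: join tests without copies, join tests per
constrained copy, speedup with one processor per copy, and speedup
when the copies are run one after another on a single processor."
  (let* ((all-tests (* pieces pieces))
         (per-copy (/ pieces colors))
         (copy-tests (* per-copy per-copy)))
    (list all-tests
          copy-tests
          (/ all-tests copy-tests)
          (/ all-tests (* colors copy-tests)))))

;; (copy-constrain-analysis 1000 20)
;; => (1000000 2500 400 20)
```

The last number shows that constraining the copies already cuts the total work 20-fold on one processor (50,000 tests instead of 1,000,000); running the 20 copies in parallel supplies the remaining factor of 20.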
(cc piece color)
...
(p join-pieces-1
  (goal ^type try-join ^id1 nil ^id2 nil)
  (piece ^color <X> ^hash-color 1 ^id <i>)
  (piece ^color <X> ^hash-color 1 ^id { <j> <> <i> })
  -->
  (modify 1 ^id1 <i> ^id2 <j>))
How Many Copies?
• if only one occurrence of a variable, n copies
• if two or more occurrences of a variable, yet they are all tested to be equal to each other, still only n copies
• however, if m different variables occur, or if one color must be <x> whereas another must be <> <x>, then n^m copies
(cc piece color)
...
(p join-pieces-a,b
  (goal ^type try-join ^id1 nil ^id2 nil)
  (piece ^color <X> ^hash-color a ^id <i>)
  (piece ^color <> <X> ^hash-color b ^id { <j> <> <i> })
  -->
  (modify 1 ^id1 <i> ^id2 <j>))
Copy-Constrain Speed Up
• in purely equijoin conditions, linear speed up even without parallel processing; parallel processing adds a further linear speed up, for a quadratic speed up overall
• in non-equijoin conditions, no speed up without parallelism, but linear speed up with parallel processing
Overall Performance Improvements
Uncertainty and Bayes’s Rule
Logic Theory vs Decision Theory
In propositional logic, propositional variables can be true or false in different models
In probabilistic decision theory, propositional variables have associated probabilities
Probabilities assign a degree of belief rather than a degree of truth
Random Variables
Boolean random variables: Cavity, Toothache, Catches (each = <true, false>)
P(Cavity=true) sometimes written as P(cavity)
Discrete random variables: Weather = <sunny, rainy, cloudy, snow>
P(Weather=sunny) sometimes written as P(sunny)
Continuous random variables: variables with real number values
Probability Axioms
0 ≤ P(a) ≤ 1
P(true) = 1; P(false) = 0
P(a ∨ b) = P(a) + P(b) - P(a ∧ b)
For discrete random variables ∑ P(V=v) = 1
Atomic Events
Cavity  Toothache  Catches  Atomic Event                          P
T       T          T        cavity ∧ toothache ∧ catches          0.108
T       T          F        cavity ∧ toothache ∧ ¬catches         0.012
T       F          T        cavity ∧ ¬toothache ∧ catches         0.072
T       F          F        cavity ∧ ¬toothache ∧ ¬catches        0.008
F       T          T        ¬cavity ∧ toothache ∧ catches         0.016
F       T          F        ¬cavity ∧ toothache ∧ ¬catches        0.064
F       F          T        ¬cavity ∧ ¬toothache ∧ catches        0.144
F       F          F        ¬cavity ∧ ¬toothache ∧ ¬catches       0.576
Atomic events are mutually exclusive and collectively exhaustive (MECE); ∑ P(e) = 1
Prior and Conditional Probabilities
The prior probability of a proposition is the degree of belief we have that it is true with no additional evidence — P(cavity) = 0.2
The conditional probability can change based on evidence — P(cavity|toothache) = 0.6
The Product Rule: P(a ∧ b) = P(b|a)P(a)
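The prior and conditional probabilities quoted here can be recomputed from the atomic-event table. This is a sketch under assumed names (*joint*, probability, conditional); the eight probabilities are the ones listed for the atomic events.

```lisp
;; Joint distribution keyed by (cavity toothache catches) truth values.
(defparameter *joint*
  '(((t   t   t)   . 0.108) ((t   t   nil) . 0.012)
    ((t   nil t)   . 0.072) ((t   nil nil) . 0.008)
    ((nil t   t)   . 0.016) ((nil t   nil) . 0.064)
    ((nil nil t)   . 0.144) ((nil nil nil) . 0.576)))

(defun probability (predicate)
  "Sum P(e) over the atomic events e that satisfy PREDICATE."
  (loop for (event . p) in *joint*
        when (apply predicate event) sum p))

(defun conditional (a b)
  "P(a|b) = P(a and b) / P(b), rearranged from the product rule."
  (/ (probability (lambda (&rest event)
                    (and (apply a event) (apply b event))))
     (probability b)))

;; Propositions as predicates over an atomic event.
(defun cavity (cavity toothache catches)
  (declare (ignore toothache catches))
  cavity)

(defun toothache (cavity toothache catches)
  (declare (ignore cavity catches))
  toothache)

;; (probability #'cavity)               ; about 0.2
;; (conditional #'cavity #'toothache)   ; about 0.6
```

Summing over all eight atomic events is exactly the exhaustive approach whose cost grows as 2^n in the number of Boolean variables.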
Inference and Independence
Exhaustive, complete inference algorithm simply calculates all conditional probabilities to find best next action -- O(2^n)
If some random variables can be deemed independent based on domain knowledge, they can be factored out
Bayes’s Rule
Since conjunction is commutative,
P(a ∧ b) = P(a|b)P(b) = P(b|a)P(a)
Thus,
P(b|a) = P(a|b)P(b)/P(a)
The conditional probability of b given a is equal to the conditional probability of a given b times the ratio of the prior probabilities of b and a
Using Bayes’s Rule
Known: if there is a cavity, there is a 90% chance that the dentist tool will catch
Known: prior probabilities of cavities and catches are 20% and 28% respectively
What is the probability of having a cavity if the tool catches?
P(cavity|catches) = P(catches|cavity)P(cavity)/P(catches) = (0.9)(0.2)/(0.28) ≈ 64%
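As a quick check of the calculation, Bayes's rule is a one-line function. The function name and argument order are assumptions; the numbers are the three "known" values stated above.

```lisp
(defun bayes (p-a-given-b p-b p-a)
  "P(b|a) = P(a|b) P(b) / P(a)."
  (/ (* p-a-given-b p-b) p-a))

;; a = catches, b = cavity:
;; (bayes 0.9 0.2 0.28)   ; about 0.643, i.e. roughly 64%
```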
Combining Evidence
P(Cause|Effect1∧Effect2∧...∧EffectN) is difficult to compute because we need the conditional probabilities of the conjunction for each value of Cause -- O(2^n)
However, if the Effects are conditionally independent of each other given the Cause (even though each depends on the Cause), then you can compute
P(Effect1∧Effect2∧...∧EffectN|Cause) = P(Effect1|Cause)P(Effect2|Cause)...P(EffectN|Cause)
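A sketch of combining evidence this way: multiply the prior by the per-effect conditional probabilities for each value of the cause, then normalize. The function name and the alist layout are assumptions. The example likelihoods are derived from the atomic-event table, for example P(toothache|cavity) = 0.12/0.2 = 0.6 and P(catches|¬cavity) = 0.16/0.8 = 0.2.

```lisp
(defun combine-evidence (prior likelihoods)
  "PRIOR is an alist of (cause-value . P(cause-value)).
LIKELIHOODS is an alist of (cause-value P(effect1|value) P(effect2|value) ...).
Returns an alist of (cause-value . posterior), normalized to sum to 1."
  (let* ((scores
           (mapcar (lambda (entry)
                     (cons (car entry)
                           (* (cdr entry)
                              (reduce #'*
                                      (cdr (assoc (car entry)
                                                  likelihoods))))))
                   prior))
         (total (reduce #'+ (mapcar #'cdr scores))))
    (mapcar (lambda (score) (cons (car score) (/ (cdr score) total)))
            scores)))

;; Observed effects: toothache and catches.
;; (combine-evidence '((cavity . 0.2) (no-cavity . 0.8))
;;                   '((cavity 0.6 0.9) (no-cavity 0.1 0.2)))
;; gives a posterior of about 0.87 for cavity and 0.13 for no-cavity
```

For this table the result agrees with reading the answer straight off the joint distribution, 0.108 / (0.108 + 0.016), because toothache and catches are conditionally independent given cavity in these numbers.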
Heuristic Search
Best-First Search
Use an evaluation function f(n) for each node
estimates “desirability”
Expand most desirable node in fringe
Enqueueing function maintains fringe in order of f(n) -- smallest (lowest cost) first
Two approaches: Greedy and A*
Romania
Map of roads between cities with distances (as used in uninformed search)
Straight-line distances to Bucharest from each city (as the crow flies)
Arad 366, Bucharest 0, Craiova 160, etc...
Greedy Best-First Search
We introduce h(n): a heuristic function that estimates the cost from n to goal
Evaluation function f(n) = h(n),
h(n) = straight line distance from state(n) to Bucharest
Greedy Best-First Search expands the node that appears to be closest to the goal
[Figure: Greedy Best-First Search. Search tree from Arad with each node labeled by h(n): Arad 366; Sibiu 253, Timisoara 329, Zerind 374; Fagaras 176, Oradea 380, Rimnicu 193; Bucharest 0.]
Properties of Greedy Search
Complete? No (if tree search) – can get stuck in loops; Yes if repeated nodes are eliminated (graph search)
Time? O(b^m), but a good heuristic dramatically improves performance
Space? O(b^m), keeps all nodes in memory
Optimal? No. Greedy search is like heuristic depth-first
A* Search
Avoid expanding paths that are already expensive
f(n) = g(n) + h(n)
g(n) = path cost from initial to n
h(n) = estimated cost from n to goal
f(n) = estimated cost from initial to goal through n
[Figure: A* Search. Search tree from Arad with each node labeled f = g + h:
Arad 366=0+366
Sibiu 393=140+253, Timisoara 447=118+329, Zerind 449=75+374
Arad 646=280+366, Fagaras 415=239+176, Oradea 671=291+380, Rimnicu 413=220+193
Craiova 526=366+160, Pitesti 417=317+100, Sibiu 553=300+253
Sibiu 591=338+253, Bucharest 450=450+0
Bucharest 418=418+0, Craiova 615=455+160]
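The f = g + h values in the A* example can be reproduced with a short best-first tree search. This is a sketch, not the lecture's code: the names are assumptions, the road distances are the differences between the g values shown in the example, and only a fragment of the map is included (cities missing from *roads* are treated as dead ends).

```lisp
(defparameter *roads*
  '((arad    (sibiu . 140) (timisoara . 118) (zerind . 75))
    (sibiu   (arad . 140) (fagaras . 99) (oradea . 151) (rimnicu . 80))
    (fagaras (sibiu . 99) (bucharest . 211))
    (rimnicu (sibiu . 80) (pitesti . 97) (craiova . 146))
    (pitesti (rimnicu . 97) (bucharest . 101) (craiova . 138))))

;; Straight-line distances to Bucharest.
(defparameter *sld*
  '((arad . 366) (sibiu . 253) (timisoara . 329) (zerind . 374)
    (fagaras . 176) (oradea . 380) (rimnicu . 193) (pitesti . 100)
    (craiova . 160) (bucharest . 0)))

(defun h (city) (cdr (assoc city *sld*)))

(defun best-first-search (start goal f)
  "Tree search that always expands the fringe entry with the lowest F.
Each fringe entry is (reversed-path . g); F is called with g and the
current city. Returns the path and its cost as two values."
  (let ((fringe (list (cons (list start) 0))))
    (loop
      (when (null fringe) (return nil))
      (setf fringe
            (sort fringe #'<
                  :key (lambda (entry)
                         (funcall f (cdr entry) (caar entry)))))
      (let* ((entry (pop fringe))
             (path (car entry))
             (g (cdr entry)))
        (when (eq (first path) goal)
          (return (values (reverse path) g)))
        (dolist (road (cdr (assoc (first path) *roads*)))
          (push (cons (cons (car road) path) (+ g (cdr road)))
                fringe))))))

(defun a-star (start goal)
  "f(n) = g(n) + h(n)."
  (best-first-search start goal (lambda (g city) (+ g (h city)))))

(defun greedy (start goal)
  "f(n) = h(n)."
  (best-first-search start goal
                     (lambda (g city) (declare (ignore g)) (h city))))

;; (a-star 'arad 'bucharest)
;; => (ARAD SIBIU RIMNICU PITESTI BUCHAREST), 418
;; (greedy 'arad 'bucharest)
;; => (ARAD SIBIU FAGARAS BUCHAREST), 450
```

Greedy search reaches Bucharest through Fagaras at cost 450, while A* keeps the cheaper route through Rimnicu and Pitesti on the fringe and returns it at cost 418. Re-sorting the whole fringe on every step is chosen for brevity; a priority queue would be the usual choice.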
Admissible Heuristics
A heuristic h(n) is admissible if for every node n, h(n) <= h*(n), where h*(n) is the true cost to reach the goal from n
An admissible heuristic thus never overestimates the cost to reach the goal -- that is, it must be optimistic
For example, straight line distance is admissible
Theorem: if h(n) is admissible, A* using tree-search is optimal
Proof of Optimality of A*
Suppose some suboptimal goal G2 has been generated (in the fringe). Let n be an unexpanded node in the fringe such that n is on the shortest path to an optimal goal G
f(G2) = g(G2) since h(G2) = 0
g(G2) > g(G) since G2 is suboptimal
f(G) = g(G) since h(G) = 0
f(G2) > f(G) from above
h(n) <= h*(n) since h is admissible
g(n) + h(n) <= g(n) + h*(n) = g(G) since n is on an optimal path to G
f(n) <= f(G)
Hence f(G2) > f(n) so A* will never select G2 for expansion
[Figure: search tree with start S, fringe node n on the path to the optimal goal G, and suboptimal goal G2]
A* Tree vs Graph Search
A* with admissible h is optimal for tree search
Not so for graph search -- A* may discard repeated states even if cheaper routes to them (i.e. lower g(n)) are found
Fix in two ways
modify graph search to check and replace repeated state nodes with cheaper alternatives
leave graph search as is, but insist on consistent h(n)
Consistent Heuristics
A heuristic is consistent if for every node n and every successor n’ of n generated by action a, h(n) <= c(n,a,n’) + h(n’)
Consistent heuristics satisfy the triangle inequality
Difficult to concoct an admissible yet inconsistent heuristic
If h is consistent,
f(n’) = g(n’) + h(n’)
= g(n) + c(n,a,n’) + h(n’)
>= g(n) + h(n)
= f(n)
That is, f(n) is non-decreasing along any path
Theorem: If h(n) is consistent, A* using graph-search is optimal
Properties of A*
Complete? Yes (unless there are infinitely many nodes with f <= f(G))
Time? Exponential
Space? Keeps all nodes in memory
Optimal? Yes
Admissible Heuristics
For example, the 8-puzzle
h1(n) = number of misplaced tiles
h2(n) = total Manhattan distance
Start state S:
7 2 4
5 _ 6
8 3 1
Goal state:
_ 1 2
3 4 5
6 7 8
h1(S) = 8
h2(S) = 3+1+2+2+2+3+3+2 = 18
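Both heuristic values can be recomputed from the two boards. In this sketch a state is a list of nine numbers read row by row, with 0 standing for the blank; the blank positions (center of the start state, top-left of the goal) are an assumption consistent with the Manhattan distances listed above.

```lisp
(defparameter *start* '(7 2 4 5 0 6 8 3 1))
(defparameter *goal*  '(0 1 2 3 4 5 6 7 8))

(defun h1 (state goal)
  "Number of misplaced tiles; the blank is not counted."
  (count nil (mapcar (lambda (s g) (or (zerop s) (= s g)))
                     state goal)))

(defun h2 (state goal)
  "Total Manhattan distance of tiles 1..8 from their goal squares."
  (loop for tile from 1 to 8
        for here = (position tile state)
        for there = (position tile goal)
        sum (+ (abs (- (floor here 3) (floor there 3)))     ; rows
               (abs (- (mod here 3) (mod there 3))))))      ; columns

;; (h1 *start* *goal*)  ; => 8
;; (h2 *start* *goal*)  ; => 18
```

Every tile is out of place, so h1 = 8, and the per-tile distances for tiles 1 through 8 are 3, 1, 2, 2, 2, 3, 3, 2, which sum to 18.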
Dominance
For admissible heuristics h1 and h2, h2 dominates h1 if h2(n) >= h1(n) for all n
Typical time complexities (number of expanded nodes) for 8-puzzle
d = 12
IDS = 3,644,035
A*(h1) = 227
A*(h2) = 73
d = 24
IDS = too many
A*(h1) = 39,135
A*(h2) = 1,641
Relaxation
Finding heuristics systematically by relaxing a problem
A problem with fewer restrictions on actions is a relaxed problem
The cost of an optimal solution to the relaxed problem is an admissible heuristic for the original problem
For 8-puzzle, allowing tiles to move anywhere generates h1 and allowing tiles to move to any adjacent square generates h2
For Romania problem, straight line distance is a relaxation generating its heuristic
Local Search
Local Search: Goal = Solution
Integrated circuit design: How should the circuitry be laid out on the chip to optimize space and function?
Job-shop scheduling: How should resources (human or equipment) be allocated and scheduled optimally?
Portfolio management: How should financial assets be allocated to optimize investment goals in a market environment?
Local Search Algorithms
Path is irrelevant
State space is set of “complete” configurations
Find a configuration that satisfies constraints
Keep a single “current” state; try to improve it
N-Queens Problem
Put n queens on an n x n board with no two queens attacking each other
[Figure: example placements of queens on a chessboard]
Hill-Climbing Search
“Like climbing a mountain in thick fog with amnesia”
Depending on initial state, can get stuck in local maxima
Hill-Climbing Search
(defun hill-climb (state successorf evalf)
  (let ((next (best (funcall successorf state) state evalf)))
    (if (eql state next)
        state
        (hill-climb next successorf evalf))))

;; Here a node is assumed to have just two slots: value and state.
(defun best (neighbors state evalf)
  (select-max
   (make-node :value (funcall evalf state) :state state)
   (mapcar #'(lambda (s) (make-node :value (funcall evalf s) :state s))
           neighbors)))

(defun select-max (best-so-far rest)
  (cond
    ((null rest) (node-state best-so-far))
    ;; >= keeps the current best on ties, so the climb stops on a
    ;; plateau instead of stepping sideways forever.
    ((>= (node-value best-so-far) (node-value (car rest)))
     (select-max best-so-far (cdr rest)))
    (t (select-max (car rest) (cdr rest)))))
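Hill-Climbing the 8-Queens
h = number of pairs of attacking queens
successorf = move one queen along its column
[Figure: 8-queens board. Q marks the current queens; each number is the value of h after moving that column's queen to that square.]
18 12 14 13 13 12 14 14
14 16 13 15 12 14 12 16
14 12 18 13 15 12 14 14
15 14 14  Q 13 16 13 16
 Q 14 17 15  Q 14 16 16
17  Q 16 18 15  Q 15  Q
18 14  Q 15 15 14  Q 16
14 14 13 17 12 14 12 18
h=17
A Local Minimum: h=1
[Figure: 8-queens board in a local minimum with h=1]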
Local Beam Search
Keep track of k states rather than just one
Start with k randomly generated states
At each iteration, generate all successors of all k states
If any one is a goal, stop; else select the k best successors and repeat
Why is this different than just running hill-climbing k times in parallel?
Stochastic Local Beam Search
Like local beam search except select k next states randomly with probability proportional to value
Addresses clustering issues that arise from local beam search
Genetic Algorithms
A successor state is generated by combining two parent states
Start with k randomly generated states: the population
States are represented as strings over a finite alphabet
Fitness function results with higher values for better states
Produce the next generation (next k states) by selection, crossover, mutation
Genetic 8-Queens
Fitness function = number of non-attacking pairs (goal = 28)
e.g. 24 ÷ (24+23+20+11) = 31%
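The fitness function and the selection percentage can be sketched as follows. The representation (a list of eight row numbers, one per column) and the function names are assumptions; the four fitness values are the ones in the example above.

```lisp
(defun non-attacking-pairs (queens)
  "QUEENS is a list of row numbers, one queen per column.
Returns the number of pairs of queens that do not attack each other."
  (let ((n (length queens))
        (attacks 0))
    (loop for i from 0 below n
          do (loop for j from (1+ i) below n
                   do (let ((ri (nth i queens))
                            (rj (nth j queens)))
                        ;; Same row, or same diagonal.
                        (when (or (= ri rj)
                                  (= (abs (- ri rj)) (- j i)))
                          (incf attacks)))))
    (- (/ (* n (1- n)) 2) attacks)))

(defun selection-probabilities (fitnesses)
  "Probability of selecting each state, proportional to its fitness."
  (let ((total (reduce #'+ fitnesses)))
    (mapcar (lambda (f) (/ f total)) fitnesses)))

;; A solved board scores the maximum of 28:
;; (non-attacking-pairs '(1 5 8 6 3 7 2 4))   ; => 28
;; The population from the example:
;; (selection-probabilities '(24 23 20 11))
;; => (4/13 23/78 10/39 11/78), i.e. about 31%, 29%, 26%, 14%
```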
Adversarial Search
Adversarial Search
Optimal decisions with the Minimax algorithm
αβ pruning
Imperfect decisions
Adversarial Search vs Search
Unpredictable opponent
explore moves for all possible replies
assume worst case (perfect opponent)
Time limits force suboptimal search
Game Tree
Two player game, deterministic, turns
Minimax Defined
Minimax(node) =
Utility(node) if terminal
maximum of Minimax of each successor if Max node
minimum of Minimax of each successor if Min node
Minimax Algorithm
Perfect play for deterministic games: Choose move with highest minimax value
This results in the best achievable payoff against best opponent
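[Figure: two-ply game tree. The leaves are 3 12 8, 2 4 6, and 14 5 2; the three Min nodes have values 3, 2, and 2; the Max root has value 3.]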
Minimax Algorithm
(pause for Lisp code)
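The Lisp code shown in lecture is not reproduced in these notes. As a stand-in, here is a minimal sketch of the minimax definition. It assumes a game tree given directly as nested lists, where a number is a terminal utility and a list is an internal node; a real game would generate successors from a state instead.

```lisp
(defun minimax (node maximizing-p)
  "Minimax value of NODE; MAXIMIZING-P is true at Max nodes."
  (if (numberp node)
      node                                   ; terminal: utility
      (let ((child-values
              (mapcar (lambda (child)
                        (minimax child (not maximizing-p)))
                      node)))
        (if maximizing-p
            (reduce #'max child-values)
            (reduce #'min child-values)))))

;; The example tree: Max at the root, three Min nodes below it.
;; (minimax '((3 12 8) (2 4 6) (14 5 2)) t)   ; => 3
```

The three Min nodes evaluate to 3, 2, and 2, and the Max root takes the largest of these.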
Properties of Minimax
Complete (if tree is finite)
Optimal against optimal opponent
Time complexity = O(b^m)
Space complexity = O(bm) -- linear, since exploration is depth-first
For chess, b ≈ 35, m ≈ 100; exact solution is infeasible
αβ pruning
keep track of best and worst possible minimax values and don’t pursue paths that cannot improve over these values
α is the value of the best choice found so far along the path for max
if value is worse than α, max-node will avoid it, pruning that branch
similarly define β for min
αβ pruning
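[Figure: αβ pruning on the minimax example tree. Leaves examined: 3 12 8, then 2, then 14 5 2. The first Min node has value 3, so the root is ≥3; the second Min node is cut off at ≤2; the third Min node's bound tightens from ≤14 to ≤5 to ≤2; the root value is 3.]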
Properties of αβ pruning
Pruning does not affect final result -- still complete and optimal
Good move ordering improves pruning effectiveness
With perfect ordering, time complexity becomes O(b^(m/2)), effectively doubling search depth
Reasoning about relevant computations — i.e. metareasoning
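A minimal sketch of αβ pruning on a game tree written as nested lists (a number is a terminal utility, a list is an internal node). The names and the leaf counter are assumptions added for illustration; the counter exists only to show how many leaves are examined.

```lisp
(defvar *leaves-evaluated* 0)

(defun alpha-beta (node maximizing-p
                   &optional (alpha most-negative-fixnum)
                             (beta most-positive-fixnum))
  "Minimax value of NODE, skipping branches that cannot matter.
ALPHA is the best value found so far for Max along the path,
BETA the best value found so far for Min."
  (cond ((numberp node)
         (incf *leaves-evaluated*)
         node)
        (maximizing-p
         (dolist (child node alpha)
           (setf alpha (max alpha (alpha-beta child nil alpha beta)))
           (when (>= alpha beta) (return alpha))))   ; Min will avoid this
        (t
         (dolist (child node beta)
           (setf beta (min beta (alpha-beta child t alpha beta)))
           (when (<= beta alpha) (return beta))))))  ; Max will avoid this

;; (let ((*leaves-evaluated* 0))
;;   (list (alpha-beta '((3 12 8) (2 4 6) (14 5 2)) t)
;;         *leaves-evaluated*))
;; => (3 7)
```

On the example tree the value is still 3, but only 7 of the 9 leaves are examined: once the second Min node sees the leaf 2, it can be worth at most 2, which is already worse for Max than the 3 in hand, so its leaves 4 and 6 are skipped. The third Min node gets no pruning because its leaves arrive in the worst order, which illustrates why move ordering matters.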
Resource Limitations
In a timed game, suppose you have 100 seconds per move
Suppose we can explore 10^4 nodes/second
Thus 10^6 nodes/move
Must cut off the search at the appropriate depth, rather than using the complete tree
Use cutoff function for terminalp
Use evaluation function (heuristic) instead of utility for evalp
evaluation functions typically weighted sum of feature functions
∑ wi fi(s)
For chess, b^m = 10^6 and b = 35 means m ≈ 4; 4-ply look ahead is a terrible chess player
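The depth estimate follows from solving b^m = budget for m. A small sketch, with an assumed function name and the numbers used above (100 seconds per move, 10^4 nodes per second, b = 35):

```lisp
(defun lookahead-depth (seconds-per-move nodes-per-second b)
  "Depth m such that b^m equals the node budget for one move."
  (log (* seconds-per-move nodes-per-second) b))

;; (lookahead-depth 100 (expt 10 4) 35)   ; about 3.9, i.e. roughly 4 ply
```

With perfect move ordering, αβ pruning's O(b^(m/2)) cost would let the same budget reach roughly twice this depth.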
Games involving chance
E.g. Backgammon
Insert chance nodes between max and min nodes
Minimax(n) =
Utility(n) if terminal
max of minimax of successors if max node
min of minimax of successors if min node
∑ p(s) ⋅ minimax(s) if chance node
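The extended definition can be sketched on a tree written as nested lists. The representation is an assumption made for illustration: a number is a terminal utility, (:max ...) and (:min ...) hold child nodes, and (:chance ...) holds (probability . child) pairs. The example tree and its probabilities are hypothetical.

```lisp
(defun expectiminimax (node)
  "Minimax value of NODE in a game tree with chance nodes."
  (if (numberp node)
      node
      (destructuring-bind (kind &rest children) node
        (ecase kind
          (:max (reduce #'max (mapcar #'expectiminimax children)))
          (:min (reduce #'min (mapcar #'expectiminimax children)))
          ;; Chance node: probability-weighted average of the outcomes.
          (:chance (reduce #'+
                           (mapcar (lambda (pair)
                                     (* (car pair)
                                        (expectiminimax (cdr pair))))
                                   children)))))))

;; Max chooses between two moves, each followed by a fair coin flip
;; and then a reply by Min:
;; (expectiminimax
;;  '(:max (:chance (1/2 . (:min 2 4)) (1/2 . (:min 6 8)))
;;         (:chance (1/2 . (:min 0 10)) (1/2 . (:min 9 12)))))
;; => 9/2
```

The first move is worth (2 + 6)/2 = 4 and the second (0 + 9)/2 = 9/2, so Max prefers the second move even though its worst outcome is lower.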