Artificial Intelligence CIS 342 The College of Saint Rose David Goldschmidt, Ph.D. March 6, 2009

Crossword Puzzle Construction

Given:
– Dictionary of valid words and phrases
– Empty crossword grid

Problem:
– Fill the crossword grid such that all words both across and down are valid
– Assign clues

Crossword Puzzle Construction

Depth-First Search (DFS)
– Fill in words until a solution is found or a dead-end is encountered
– Backtrack from dead-ends
– Questions:
  Where do we start?
  What word do we fill in next?
  What backtracking strategies do we use?
  How do we avoid repetition (boring puzzles)?
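The fill-and-backtrack loop above can be sketched in a few lines. This is a minimal illustration, not the method used in the lecture: it assumes the grid is described as a list of slots (each slot a list of cell coordinates), that slots are filled in the order given, and that no word may appear twice. The names `fill`, `slots`, and `words` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]
Slot = List[Cell]


def fill(slots: List[Slot], words: List[str],
         grid: Optional[Dict[Cell, str]] = None,
         used: Optional[Set[str]] = None) -> Optional[Dict[Cell, str]]:
    """Depth-first fill: returns a cell -> letter map, or None at a dead-end."""
    grid = {} if grid is None else grid
    used = set() if used is None else used
    if not slots:                      # every slot filled: solution found
        return dict(grid)
    slot, rest = slots[0], slots[1:]
    for word in words:
        if word in used or len(word) != len(slot):
            continue
        # The word must agree with letters already placed by crossing words.
        if any(grid.get(cell, ch) != ch for cell, ch in zip(slot, word)):
            continue
        newly_filled = [cell for cell in slot if cell not in grid]
        for cell, ch in zip(slot, word):
            grid[cell] = ch
        used.add(word)
        result = fill(rest, words, grid, used)
        if result is not None:
            return result
        # Dead-end further down: backtrack by undoing this placement.
        used.discard(word)
        for cell in newly_filled:
            del grid[cell]
    return None


# A 2x2 grid with two across and two down slots (assumed toy example).
slots = [[(0, 0), (0, 1)], [(1, 0), (1, 1)],
         [(0, 0), (1, 0)], [(0, 1), (1, 1)]]
solution = fill(slots, ["NO", "AT", "TO", "AN", "IT"])
```

Here the search first tries NO in the top row, hits a dead-end on the first down slot, backtracks, and then succeeds with AT over NO.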

Crossword Puzzle Construction

Optimize the DFS:
– Add longer (most constrained) words first
– Associate weights with words in dictionary based on frequency of letters
  Friendly crossword puzzle words include letters: S, R, E, T, D, A, I, L
  Unfriendly crossword puzzle words include letters: J, Q, X, Z, F, V, W
  e.g. quiz, fix, jazz, quaff, xylophone, wax
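One possible way to turn the friendly and unfriendly letter lists into word weights is sketched below. The scoring rule (+1 per friendly letter, -1 per unfriendly letter) and the function names are assumptions made for illustration; the slides do not specify the actual weighting.

```python
from typing import List

FRIENDLY = set("SRETDAIL")     # letters listed as crossword-friendly
UNFRIENDLY = set("JQXZFVW")    # letters listed as crossword-unfriendly


def friendliness(word: str) -> int:
    """Assumed scoring rule: +1 per friendly letter, -1 per unfriendly letter."""
    score = 0
    for ch in word.upper():
        if ch in FRIENDLY:
            score += 1
        elif ch in UNFRIENDLY:
            score -= 1
    return score


def rank_words(words: List[str]) -> List[str]:
    """Order candidate words from most to least friendly."""
    return sorted(words, key=friendliness, reverse=True)


def order_slots(slot_lengths: List[int]) -> List[int]:
    """Longer (most constrained) slots first, as indices into slot_lengths."""
    return sorted(range(len(slot_lengths)), key=lambda i: -slot_lengths[i])
```

With this rule, a word such as "trailed" scores 7, while "jazz" scores -2, so a DFS that tries candidates in `rank_words` order would reach for "trailed" long before "jazz".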

[Figure: genetic algorithm example with binary chromosomes. Panels: Generation i (chromosomes X1i to X6i, each with a fitness value f); Generation (i + 1) (chromosomes X1i+1 to X6i+1, each with a fitness value f); Crossover; Mutation.]

Crossword Puzzle Construction

Genetic Algorithm (GA)
– Evolve a solution by crossovers and mutations through many generations
– Initial population of crossword grids:
  Random letters?
  Random letters based on Scrabble® frequencies?
  Random words from dictionary?
– Fitness of each grid is number of valid words
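The GA ingredients named above can be sketched as follows. This is an illustrative sketch under stated assumptions: a grid is a list of equal-length strings with `#` marking black squares, entries shorter than two letters are ignored, crossover splices whole rows, and mutation changes one white square. None of these representation choices come from the slides.

```python
import random
from typing import List, Set

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def entries(line: str) -> List[str]:
    """Split a row or column on black squares, keeping entries of length >= 2."""
    return [run for run in line.split("#") if len(run) >= 2]


def fitness(grid: List[str], dictionary: Set[str]) -> int:
    """Fitness of a grid: the number of across and down entries that are valid words."""
    columns = ["".join(col) for col in zip(*grid)]
    return sum(1 for line in grid + columns
               for entry in entries(line) if entry in dictionary)


def crossover(a: List[str], b: List[str], point: int) -> List[str]:
    """One-point crossover on rows: rows before `point` from a, the rest from b."""
    return a[:point] + b[point:]


def mutate(grid: List[str], rng: random.Random) -> List[str]:
    """Replace the letter in one randomly chosen white square."""
    whites = [(r, c) for r, row in enumerate(grid)
              for c, ch in enumerate(row) if ch != "#"]
    r, c = rng.choice(whites)
    row = grid[r]
    new_row = row[:c] + rng.choice(ALPHABET) + row[c + 1:]
    return grid[:r] + [new_row] + grid[r + 1:]


dictionary = {"AT", "NO", "AN", "TO"}
child = crossover(["AT", "NX"], ["QQ", "NO"], point=1)   # ["AT", "NO"]
score = fitness(child, dictionary)                       # 4: AT, NO, AN, TO
```

A full GA would repeat selection, crossover, and mutation over many generations, keeping grids with higher `fitness`.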

Solving Crossword Puzzles

Given:
– Crossword grid
– Clues

Problem:
– Fill the grid such that all words correctly answer the given clues

Solving Crossword Puzzles

Obtain candidate answers for each clue
– Assign a confidence value to each candidate
– Are we guaranteed to have the correct answer?

Place candidate answers in grid until a solution is found or a dead-end occurs
– Which backtracking strategies should we use?

Solving Crossword Puzzles

PROVERB (Duke University, 1999)
– Modules provide candidate answers from dictionaries, encyclopedias, movie databases, etc.
– One module draws on a Crossword Puzzle Database of exactly 5142 previously solved puzzles
  Pivotal in PROVERB’s success
– Another module generates all combinations of letters (ouch!)

Solving Crossword Puzzles

Google CruciVerbalist (GCV)

Solving Crossword Puzzles

GCV solved 13x13 puzzle with 68 clues
– Many clues are fill-in-the-blank or pop-culture clues
– Candidate answers obtained from Google results page (top 50)
– Solved using 559 Google queries
– Queries yielded 68 correct answers
  44 correct answers had highest confidence

Clue Preprocessing

Categorize clues based on text and type of clues:
– Fill-in-the-blank clues
– Synonyms/Antonyms
– “Type of” (or “Kind of”) clues
– Abbreviations
– Clues with “and” or “or”
– Singular or plural
– Number of words in answer

Clue Preprocessing

Translate clues to Google-friendly forms
– “To ___ is human”
  “To * is human”
  “To * * is human”
– “Mary ___ little lamb” (2 words)
  “Mary * * little lamb”
– “___ to Joy” by Beethoven
  “* to Joy” by Beethoven
  “* * to Joy” by Beethoven
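The blank-to-wildcard translation shown above can be sketched as a small function. This is an illustration only: the function name, the `max_words` default of 2 (taken from the one-star and two-star forms in the examples), and the rule that a blank is two or more underscores are assumptions.

```python
import re
from typing import List, Optional

BLANK = re.compile(r"_{2,}")   # assumed: a blank is a run of two or more underscores


def blank_queries(clue: str, num_words: Optional[int] = None,
                  max_words: int = 2) -> List[str]:
    """Replace the blank with one '*' per answer word.

    If the number of words in the answer is known, emit only that form;
    otherwise emit one query for each word count from 1 to max_words.
    """
    counts = [num_words] if num_words else range(1, max_words + 1)
    return [BLANK.sub(" ".join("*" * n), clue) for n in counts]


print(blank_queries("To ___ is human"))
# ['To * is human', 'To * * is human']
print(blank_queries("Mary ___ little lamb", num_words=2))
# ['Mary * * little lamb']
```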

Clue Preprocessing

Translate clues to Google-friendly forms
– Diplomacy
  synonyms of Diplomacy
– Not dry
  opposite of dry
  antonyms of dry
– Joy
  synonyms of Joy
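A sketch of the synonym and antonym rewrites above, assuming that a clue beginning with "Not" is treated as an antonym clue and any other clue as a synonym clue. The function name and that detection rule are hypothetical; how GCV actually categorizes such clues is not given here.

```python
import re
from typing import List


def thesaurus_queries(clue: str) -> List[str]:
    """Rewrite a clue into 'synonyms of', 'opposite of', or 'antonyms of' queries."""
    clue = clue.strip()
    negated = re.fullmatch(r"not\s+(.+)", clue, flags=re.IGNORECASE)
    if negated:
        term = negated.group(1)
        return [f"opposite of {term}", f"antonyms of {term}"]
    return [f"synonyms of {clue}"]


print(thesaurus_queries("Diplomacy"))   # ['synonyms of Diplomacy']
print(thesaurus_queries("Not dry"))     # ['opposite of dry', 'antonyms of dry']
```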

Clue Preprocessing

Translate clues to Google-friendly forms
– Type of dancing [or Kind of dancing]
  * dancing
– Second sight (abbr.)
  Second sight
  abbreviations of Second sight
– Superman’s admirer
  admirer of Superman

Clue Preprocessing

Translate clues to Google-friendly forms
– Couldn’t move
  Could not move
  Could opposite of move
  Could antonyms of move
– Knight or Danson
  Knight Danson

Clue Preprocessing

Translate clues to Google-friendly forms
– Bosley and Arnold
  Bosley Arnold
  Append an ‘s’
– Henson, and others [or Henson, and namesakes]
  Henson
  Append an ‘s’

Results of Google-Querying

GCV excels at solving fill-in-the-blank and pop-culture clues
– Why?

Though results are encouraging, using keyword-based searching is limited
– Why?

Populating the Crossword Grid

Use a Depth-First Search (DFS) algorithm:
– Fill in the crossword grid based on confidence values of candidate words
– At each iteration:
  Select candidate word with highest confidence value amongst clues not yet placed
  Attempt to fit candidate word into grid
– Halt when a solution is found or a dead-end occurs
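The iteration described above can be sketched as a confidence-ordered placement loop. This is a simplified illustration with assumed data shapes: `slots` maps a clue label to its grid cells, and `candidates` maps a clue label to `(word, confidence)` pairs. It stops at the first dead-end and leaves the choice of backtracking strategy open; candidates that do not fit the current grid are simply skipped.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def fits(grid: Dict[Cell, str], cells: List[Cell], word: str) -> bool:
    """A word fits if it has the right length and agrees with letters already placed."""
    return len(word) == len(cells) and all(
        grid.get(cell, ch) == ch for cell, ch in zip(cells, word))


def populate(slots: Dict[str, List[Cell]],
             candidates: Dict[str, List[Tuple[str, float]]]
             ) -> Tuple[Dict[str, str], str]:
    """Place the highest-confidence fitting candidate until solved or stuck."""
    grid: Dict[Cell, str] = {}
    placed: Dict[str, str] = {}
    while len(placed) < len(slots):
        options = [(conf, clue, word)
                   for clue, cands in candidates.items() if clue not in placed
                   for word, conf in cands if fits(grid, slots[clue], word)]
        if not options:
            return placed, "dead-end"       # no remaining candidate fits
        _, clue, word = max(options)        # highest confidence first
        for cell, ch in zip(slots[clue], word):
            grid[cell] = ch
        placed[clue] = word
    return placed, "solved"


slots = {"1A": [(0, 0), (0, 1), (0, 2)], "1D": [(0, 0), (1, 0), (2, 0)]}
candidates = {"1A": [("CAT", 0.9), ("DOG", 0.5)],
              "1D": [("DIG", 0.8), ("COW", 0.6)]}
result = populate(slots, candidates)
# ({'1A': 'CAT', '1D': 'COW'}, 'solved')
```

In this toy run CAT is placed first; DIG no longer fits the shared first square, so COW is placed for 1-Down despite its lower confidence.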

Populating the Crossword Grid

When a dead-end occurs, what do we do?
– Backtrack: Remove last word placed in grid
  Disadvantages?
– Backjump: Identify culprit and remove all words back to culprit word
  Disadvantages?

Populating the Crossword Grid

When a dead-end occurs, what do we do?
– Extricating Backjump: Identify and remove the culprit
  Disadvantages?
– How do we identify the culprit?

Extricating Backjumping

Assign weights to the squares of the grid
– Square weights correspond to confidence values of candidate words placed
– e.g. Place TWAIN with confidence value of 10 at 5-Across

Extricating Backjumping

Weights of interlocking words are multiplied

Extricating Backjumping

Define grid weight of a word as the sum of each individual square weight

– e.g. TWAIN = 100, NOW = 72

Extricating Backjumping

When a dead-end occurs, the culprit is the word with the lowest grid weight
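The weighting scheme can be worked through in code. The sketch below follows the stated rules: each square takes the confidence of the word placed on it, interlocking squares multiply the confidences, and a word's grid weight is the sum of its square weights. The layout is an assumption chosen to reproduce the TWAIN = 100 and NOW = 72 figures: TWAIN has confidence 10 as in the example, while NOW's confidence of 6 and its crossing at the N are inferred, not given in the source.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, int]
Placement = Tuple[List[Cell], float]      # (cells covered, confidence value)


def square_weights(placements: Dict[str, Placement]) -> Dict[Cell, float]:
    """Square weight = product of the confidences of all words covering the square."""
    weights: Dict[Cell, float] = {}
    for cells, confidence in placements.values():
        for cell in cells:
            weights[cell] = weights.get(cell, 1) * confidence
    return weights


def grid_weight(word: str, placements: Dict[str, Placement],
                weights: Dict[Cell, float]) -> float:
    """Grid weight of a word = sum of the weights of its squares."""
    cells, _ = placements[word]
    return sum(weights[cell] for cell in cells)


def culprit(placements: Dict[str, Placement]) -> str:
    """The culprit is the placed word with the lowest grid weight."""
    weights = square_weights(placements)
    return min(placements, key=lambda w: grid_weight(w, placements, weights))


placements = {
    "TWAIN": ([(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], 10),  # confidence from the example
    "NOW": ([(0, 4), (1, 4), (2, 4)], 6),                     # assumed confidence and position
}
weights = square_weights(placements)
# TWAIN: 10 + 10 + 10 + 10 + 60 = 100; NOW: 60 + 6 + 6 = 72
```

Under these assumptions `culprit(placements)` returns "NOW", the word with the lower grid weight.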

A Sampling of Crossword Puzzles

– New York Times
– TV Guide #42
– TV Guide #63
– Mensa Kids Puzzle #3

Results of Grid Solving

Limitations of Keyword-Based Search

Google and GCV use keyword-based tricks to artificially improve result sets
– Word frequency & proximity to other words
– Additional keywords to help direct queries to good candidate answers
  e.g. synonyms of
– Grammatical and structural rearrangements

Lack of precision in keyword-based search
– Irrelevant results in candidate answer lists
– Confidence values based on word frequency produce many false positives
– Correct answer is often buried among other mediocre (and incorrect!) candidates

In Conclusion

Other uses of the Web as an automated information source?
– Keyword-based search is insufficient
– Lacks the means for machine-interpretable information
– Semantic Web