Finite-state automata 3 Morphology Day 14 LING 681.02 Computational Linguistics Harry Howard Tulane...

Preview:

Citation preview

Finite-state automata 3

MorphologyDay 14

LING 681.02Computational Linguistics

Harry HowardTulane University

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

2

Course organization

http://www.tulane.edu/~ling/NLP/NLTK is installed on the computers in this

room!How would you like to use the Provost's

$150?

SLP §2.2 Finite-state automata

2.2.6 Recognition as search

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

4

Non-deterministic recognition: Search

In a non-deterministic FSA, there is at least one path through the machine for a string that is in the language defined by the machine.

There is no path through the machine that leads to an accept state for a string not in the language.

But not all paths directed through the machine for an accept string lead to an accept state.

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

5

Non-deterministic recognition

So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept.

Failure occurs when all of the possible paths for a given string lead to failure.

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

6

Back to the example

b a a a ! $

q0 q1 q2 q2 q3 q4

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

7

Exampleq

0

b a a a !q

1

b a a a !q

2

b a a a !

q

2

b a a a !q

2

b a a a !

X

q

3

b a a a !q

4

b a a a !

1

2

3

4

5

6

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

8

Summary

States in the search space are pairings of tape positions and states in the machine.

By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

9

Keeping track

But how do you keep track?Depth-first/last in first out (LIFO)/stack

Unexplored states are added to the front of the agenda, and they are explored by going to the most recent.

Breadth-first/first in first out (FIFO)/queueUnexplored states are added to the back of the

agenda, and they are explored by going to the most recent.

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

10

Depth-first/LIFO/stackq2

q18q12

q41

q27

q2

q12

q27

q50

q31

stack

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

11

Breadth-first/FIFO/queue

q2

q18q12

q41

q27

q2 q12 q27

q50

q31

queue

SLP §2.2 Finite-state automata

2.2.7 Comparison

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

13

Equivalence

Non-deterministic machines can be converted to deterministic ones with a fairly simple construction.

That means that they have the same power:non-deterministic machines are not more

powerful than deterministic ones in terms of the languages they can accept.

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

14

Why bother?

Non-determinism doesn’t get us more formal power and it causes headaches, so why bother?

More natural (understandable) solutions.

SLP §3 Words and transducers

Intro

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

16

Concepts and terminology

study of spelling study of word composition to build a structured

representation of a word or sentence

input to this process a process that applies

without limitations Can all forms be stored in

advance?

orthographymorphologyparsingsurface or input formproductive

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

17

Concepts and terminology

the minimal meaning-bearing unit in a language

the main unit additional units a unit that:

precedes the main one follows the main one surrounds the main one is inserted within the main one

a language in which the main unit can have many additional units

morphemestemaffixprefixsuffixcircumfixinfixagglutinative

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

18

Concepts and terminology

Combining an affix to a stem does not change the part of speech of the stem.

Combining an affix to a stem DOES change the part of speech of the stem.

Combining multiple stems.Combining a stem with a

phonologically reduced stem.

inflectionderivationcompoundingcliticization

SLP §3 Words and transducers

§3.1 Survey of (mostly) English morphology

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

20

Inflectional morphology

stem -s -ing preterite past part.

walk walks walking walked walked

try tries trying tried tried

map maps mapping mapped mapped

eat eats eating ate eaten

catch catches catching caught caught

be is being was been

Next time

P4

SLP §3.2ff